Legal Restrictions, “Sunspots,” and Peel’s
Bank Act: The Real Bills Doctrine versus the
Quantity Theory Reconsidered
This paper considers two questions: (i) what is the purpose of legal
restrictions intended to separate “money” from “credit markets,”
and (ii) is such a separation desirable? It is argued that historical
legal restrictions meant to achieve such a separation were designed
to preclude the occurrence of sunspot equilibria. It is also shown that
a coherent model can be constructed in which sunspot equilibria
exist in the absence of legal restrictions, but not if money and credit
markets are separated. Nevertheless, there is no obvious welfare
justification for such a separation.



I. Introduction

One of the oldest questions in monetary economics can be stated as
follows: Is it desirable to impose restrictions on an economy that have
the effect of “separating” money from credit markets? This is a debate
that has proceeded at both a theoretical and a practical policy-making
level. From the standpoint of actual policy formulation, this
debate had perhaps its most impressive consequences in the form of
Peel’s Bank Act of 1844. The passage of this act represented a legislative
victory for advocates of the view that it is economically desirable
to have money and credit markets made distinct by legal restrictions
on the trades of private agents (as well as the government).


In writing this paper I have benefited from the comments of Costas Azariadis and an
anonymous referee, and from discussions with Valerie Bencivenga, Scott Freeman, and
David Laidler. I retain responsibility for all errors.
There are several reasons to take an interest in the debate over
whether money and credit markets should be separated by regulation.
One is that the issues involved in the debate are intrinsically
interesting. In particular, proponents of what Sargent and Wallace
(1982) have identified as the quantity theory have argued that monetary
regimes in which money and credit markets are not separated
permit “excessive” fluctuations in prices and in the stock of (inside
plus outside) money. Whether or not this is true, according to some
definition of excessive fluctuations, is itself interesting, and it is
interesting to ask how this could be so. Second, Sargent and Wallace (1982)
have reopened the “real bills doctrine versus the quantity theory”
debate over whether it is economically desirable to separate money
from credit markets, even if the quantity theorists are correct. However,
other notions than theirs could be used to define “excessive”
fluctuations. It could be asked whether their conclusions hold under
definitions of excessive price fluctuations that seem more closely
related to historical debates than their definition does.

Third, a variety of papers have taken up the issue of “legal restrictions”
theories of money (see, e.g., Wallace 1983; Bryant and Wallace
1984; Makinen and Woodward 1986; Sargent and Smith 1986, 1987).
These papers focus on the idea that money can coexist with assets that
dominate it in rate of return only if legal restrictions on the trades of
private agents prevent direct “competition” between money and other
dominating assets. Such an approach to explaining the coexistence of
money with assets bearing superior return streams leaves the obvious
question of why such legal restrictions might be imposed. Bryant and
Wallace (1984) have suggested one reason: legal restrictions, via their
ability to enhance earnings from an inflation tax, may be a desirable
part of an optimal tax scheme. However, this explanation for these
restrictions would not appear to be universal. For instance, Peel’s
Bank Act simultaneously imposed restrictions designed to separate
money from credit markets and required that all “new” Bank of England
note issues be backed by 100 percent gold or silver reserves.
Hence seigniorage revenue was not at issue. Then it seems natural to
ask whether the kinds of issues raised by the quantity theorists might
explain the desirability of these restrictions.

This paper will provide a partial reinterpretation of some of the
debate over whether or not money and credit markets should be
legally separated. In particular, the debate preceding and leading up
to the passage of Peel’s Bank Act is briefly examined. This act is taken
as a particularly interesting legal restrictions episode since it appears
not to be amenable to the Bryant-Wallace (1984) rationale. It is argued
that important advocates of some of the restrictions embodied
in Peel’s Act took the view that a failure to separate money from
credit markets could result in a situation in which prices, the stock of
inside money, and other equilibrium quantities might display cyclical
variation that was not due to variation in “economic fundamentals.”
Or, in other words, it is suggested that these advocates were concerned
about the possible existence of what are now called “sunspot
equilibria.” They apparently further believed that the restrictions
embodied in Peel’s Act were capable of preventing such cyclical
fluctuations or of ruling out sunspot equilibria. Finally, it is clear that prices
and (inside) money stocks that fluctuated according to the realization
of some sunspot variable might be viewed as fluctuating excessively,
so that this is a version of the quantity theory viewpoint.

The paper will then examine whether it is possible that there are
economies that (a) have sunspot equilibria in the absence of legal
restrictions but that (b) have no such equilibria when money and
credit markets are separated. This issue is taken up in the context of a
simple overlapping generations model with borrowing and lending.
The model will be kept simple and illustrative and generally resembles
the economy studied by Sargent and Wallace (1982). It will be
seen that it is possible to find economies that satisfy points a and b
above. Hence restrictions such as those implied by Peel’s Act can be
rationalized (for certain kinds of environments) as ruling out classes
of equilibria. This justification of legal restrictions resembles Wallace’s
(1981) argument that the design of monetary regimes can be undertaken
so as to rule out certain equilibria, although Wallace designed a
regime that prevents equilibria in which the value of money converges
to zero.

This line of analysis bears also on the questions raised by Sargent
and Wallace and the quantity theorists. Specifically, the quantity
theorists argued that a failure to separate money from credit markets
could result in excessive price fluctuations and even in price level
indeterminacy. Clearly, the fact that sunspot equilibria may exist
when money and credit markets are not separated may be viewed as a
validation of such claims. Moreover, the imposition of Sargent and
Wallace’s “quantity theory regime” can rule out such equilibria. This
might be viewed as a justification for imposing restrictions meant to
separate money from credit markets. However, as is the case in Sargent
and Wallace, it will be seen that there is no welfare sense in which
it is obviously desirable to impose such restrictions.

The remainder of the paper is organized as follows. Because the
paper seeks to motivate restrictions that resemble those imposed by
Peel’s Act, Section II examines some of the arguments proffered in
favor of the kinds of regulations imposed. It will be seen that some
advocates of the separation of money and credit markets were, in fact,
concerned about “nonfundamental” cycles, or “sunspots.” Section III
lays out a model quite similar to that employed by Sargent and Wallace
(1982) to examine the same questions that they took up. This
model is then used to show that sunspot equilibria can arise under
what Sargent and Wallace referred to as a laissez-faire regime and,
moreover, that legal restrictions of the type contemplated in Peel’s
Act can eliminate these sunspot equilibria. Finally, a welfare analysis
of monetary regimes with and without legal restrictions is undertaken
in the context of an example. Section IV presents conclusions.

II. Peel’s Bank Act¹

Peel’s Bank Act of 1844 divided the Bank of England into an “Issue
Department” and a “Banking Department,” which were to behave as
separate entities. In particular, the Banking Department was precluded
from paying out notes other than those acquired as a result of
transactions with the Issue Department. The Issue Department was
allowed to issue £14 million of notes backed by government debt, but
beyond this all note issues were required to be made only in exchange
for gold or silver. Thus Bank of England note issues had 100 percent
marginal gold or silver backing. The act also placed a limit on the
ratio of silver to gold reserves for the Bank. Other note-issuing banks
had their issues frozen at an authorized level equal to average
circulation over a specified period of 1843. Banks not issuing notes in 1844
were precluded from any such issues. These are the provisions of the
act relevant to the discussion here. The act did not restrict the
behavior of banks with respect to deposits.

Peel’s Bank Act, then, was meant to separate “money” markets
from other “credit” markets, even to the extent of doing so for the
Bank of England. It was meant also to prevent fluctuations in bank
note issues. The act embodied proposals made over several years by
economists of the so-called Currency School. The remainder of this
section examines some of their arguments in favor of the kinds of
legal restrictions imposed.

As stated by White (1984, p. 53), after the resumption of gold 
convertibility for Bank of England notes in 1819, 

the overriding concern of British writers on money and
banking was with the causes of and cures for the general
business fluctuations that repeatedly beset the economy. . . .

The Currency School of Samuel Jones Loyd and J. R.
McCulloch held a partly monetary theory of trade cycle causation
that charged both the Bank of England and especially
the country banks with disequilibrating conduct in issuing
bank notes. . . .

As a cure the Currency School advocated centralizing control
of the currency supply by eliminating the country banks’
right of issue.

1 Most of the subsequent discussion is culled from Fetter (1965) and White (1984).
For a more complete description of the actual act, see p. 185 in Fetter or p. 76 in White.

In fact, a number of these Currency School writers advanced theories
of the cycle that depended on “nonfundamental” causes of fluctuations.
According to Fetter (1965, pp. 129-30), the so-called currency
principle (that notes should fluctuate in the same way as would a
purely metallic currency) was based on the idea “that all paper money
was a disturbing element in the economy because of the sudden
expansion and contraction that it caused in the total monetary supply.”
It was felt that “aside from bank failures or bank runs, the issuing of
notes on fractional reserves, even by banks of unquestioned solvency,
caused undesirable increases and decreases in the monetary supply.”

Fetter identified the first organized statement and defense of the
currency principle in the writings of James Pennington, whom he
quotes (pp. 130-31): “are there, then, no means to be found, of
preventing those alterations of excitement and depression – of
extravagant expectation and disappointed hope – but in the exclusive
employment of so expensive a medium of interchange as gold, and
the suppression of paper?” This anticipated subsequent theories of
cycles driven by expectations. White (1984, p. 108) referred to
McCulloch’s account of the cycle: that “an initial equilibrium is disturbed by
an increase in the demand for bank loans for speculative purposes.
This prompts increased country bank issues. The growing money
stock, rising prices, and speculation feed upon one another.” Loyd is
quoted by White (p. 109) as arguing that “so long as human nature
remains what it is, and hope springs eternal in the human breast,
speculations will occasionally occur, and bring with them their
attendant train of alternative periods of excitement and depression.”
Finally, Longfield, another member of the Currency School, “also
advanced the notion that business cycles were driven by waves of
optimism and pessimism. The wave of optimism was first swollen by
liberal bank lending, and then dashed on the rocks of the credit
contraction prompted by external drain” (p. 109).²

It will subsequently be useful to explore these views somewhat further.
According to White (pp. 124-25),

2 These “sunspot” theories of the cycle appear not to have been unique to the
Currency School. Thomas Tooke, a member of the so-called Banking School, had a view of
a “boom [which] might be financed entirely by an undue expansion of trade credit
prompted by ‘the excess of confidence’” (White 1984, p. 110).
The Currency School criticized the tendency of the provincial
[note] circulations to increase with high prices.
George W. Norman . . . and Samuel Jones Loyd . . . viewed
such an issue of notes . . . to be unsound. Their argument
always regarded the high prices in question as too high. . . .
Increased provincial issues would . . . [sustain] the high
prices. . . . It is not immediately clear what they believed to
be the origin of excessively high prices, although Norman
associated them with “periods of excitement,” so that they
were likely thinking of . . . their trade cycle theory, in which
an unexplained wave of optimism sweeps the economy.

It will be useful for future reference to notice that high prices
“produce” high volumes of inside indebtedness according to this argument.
It will also be useful to note that, in some versions of the
Currency School view, the rate of interest was not driven down by
large note issues. According to Joplin (quoted by White, p. 100), “the
interest of money, when it is abundant, is not reduced, but the
circulation . . . is diminished.” This is of interest since it is consistent with a
view in which cycles are driven by sunspots, as will be seen subsequently.

Prior to concluding this section, it is appropriate to raise some
issues that are relevant to the modeling strategy employed below.
First, the economists referred to above were concerned about bank
note issues against the background of an economy on a specie standard.
The model presented below will analyze the issues raised here
in the context of an economy with a fixed stock of fiat money. This is
the modeling approach of Sargent and Wallace (1982), which is also
adopted here. Moreover, one could think of the model as one with a
fixed stock of specie.³ However, it would be interesting in future
research to examine the questions raised here in a model that permitted
specie stocks to vary endogenously.

Second, the model below ignores banks, focusing instead on inside 
indebtedness. In this the analysis simply follows Sargent and Wallace 
(1982), and their defense of this approach could be adopted. 

Third, Peel’s Bank Act ignored bank deposits. According to Fetter
(1965, p. 132), the arguments in favor of the currency principle indicated
an unawareness of “the monetary significance of bank deposits.”
The analysis here will follow these arguments in that the modeling
strategy is to act as if Peel’s Act could successfully separate money
and credit markets. Whether in fact it could have done so while ignoring
bank deposits is a separate issue, which is not taken up here.

3 In particular, it would be easy to modify the commodity money model of Sargent
and Wallace (1983) to produce an economy that functions similarly to the one discussed
below. Specifically, if their “technology” is specified in a way that rules out gold
production, then the Sargent-Wallace model behaves much like the model of Sec. III.

Fourth, the model presented will be a model of an exchange economy.
This prevents sunspots from affecting the level of economic
activity, which accords well with what Fetter points to (p. 142) as a
common “assumption that changes in prices affected the distribution
of income, but had little if any effect on the total of income and
employment.” Thus the model of an exchange economy accords well
with the spirit of some of the nineteenth-century debate.

III. Sunspots and Peel’s Bank Act 

The purpose of this section is to provide a simple model illustrating
the following three points: (i) It is possible to construct a model in
which the cyclical “analysis” discussed in Section II is correct. (ii) The
nonfundamental cycles that arise in the absence of legal restrictions
can be precluded (under certain circumstances) by the imposition of
legal restrictions resembling those imposed by Peel’s Bank Act. (iii)
Nevertheless, there is no obvious presumption in favor of the desirability
of such restrictions. In this section such a model is described.

A. The Model

The model consists of an infinite sequence of two-period-lived,
overlapping generations plus a set of initial old agents. Let $t$ index time,
with $t = 0, 1, \ldots$. Then at each date $t$ two generations are present:
one that is young at $t$ and one that is in its final period.

Each young generation is identical in size and composition and
consists of two groups of agents. Group 1, which has $\lambda N$ members,
$0 < \lambda < 1$, will be called “savers,” and group 2, which has
$(1 - \lambda)N$ members, will be called “borrowers.” All members of
group $i$ ($i = 1, 2$) are identical.

At each date all agents possess some endowment of a single nonproduced,
nonstorable consumption good. Let $c_1$ denote consumption
when young, and $c_2$ old-age consumption. Then each saver has
preferences described by the utility function $W(c_1, c_2) = H(c_1) + c_2$,
with $H' \ge 0$, $H'' \le 0$ $\forall c_1 \in R_+$, which is taken to be the
consumption set at each date.⁴ The endowments of these agents are given by the
vector $(y, 0)$, $y > 0$. Each borrower has preferences described by the
utility function $U(c_1, c_2)$ and an endowment stream given by the vector
$(w_1, w_2)$; $w_1 \ge 0$, $w_2 > 0$. The function $U(c_1, c_2)$ is assumed to be twice
continuously differentiable, strictly increasing in each argument, and
strictly concave. Notice that preferences and endowments are nonstochastic.

4 Clearly, this specification of the preferences of savers is restrictive. If $W(c_1, c_2)$ fails
to be linear in $c_2$, however, borrowers will borrow and accumulate money balances in
any nontrivial sunspot equilibrium. Construction of such equilibria then becomes
extremely elaborate. Therefore, simplicity dictates that $W(c_1, c_2)$ be linear in $c_2$.

These agents trade goods for fiat currency and for promised repayments
of goods at future dates (i.e., they engage in borrowing and
lending). At each date there is a fixed and constant per capita stock of
fiat money $M$ outstanding. Without loss of generality, let $M = 1$.
Notice that this quantity is also nonstochastic.

There is no “fundamental” randomness in this economy then.
However, suppose that there is some set of random events that can
occur at each date, $E$. For our purpose, it suffices to let $E = \{1, 2\}$ and
to let $e \in E$ index events. Thus, for instance, $e = 1$ may be the event
“no sunspots,” while $e = 2$ is the event “sunspots.” The state of nature
evolves according to a two-state Markov chain, as in Azariadis
(1981). Then if $e$ is “today’s state” and $e'$ is “next period’s state,” let
$\text{prob}\{e' = 1\} = q(e)$, $e = 1, 2$.
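The two-state Markov chain for the sunspot variable can be sketched in a few lines. This is only an illustration of the definition of $q(e)$ as the probability that next period's state is 1 given today's state $e$; the function name, the seed, and the particular values of $q(1)$ and $q(2)$ are assumptions made for the example, not part of the model.

```python
import random

def simulate_sunspot_states(q, e0, periods, seed=0):
    """Simulate a path of the sunspot state e_t in {1, 2}.

    q[e] is the probability that next period's state is 1 given that
    today's state is e. The seed is an arbitrary illustrative choice.
    """
    rng = random.Random(seed)
    states = [e0]
    for _ in range(periods - 1):
        today = states[-1]
        # Next state is 1 with probability q(today), otherwise 2.
        states.append(1 if rng.random() < q[today] else 2)
    return states

# Hypothetical transition probabilities, for illustration only.
path = simulate_sunspot_states({1: 0.74, 2: 0.24}, e0=1, periods=10)
print(path)
```

Because preferences, endowments, and the money stock are nonstochastic, a path such as this one affects allocations only if agents condition prices on it.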

It is now possible to provide some notation for agents’ trades and
for the prices at which they trade. Fiat currency trades for the
consumption good at rate $S_t(e)$ at $t$; that is, $S_t(e)$ is the inverse price level at
$t$ as a function of the current-period state. Below only savers will
accumulate currency. Let $m_t(e)$ be the (per capita) demand for real
balances by savers at $t$, which again may depend on the current state.
Also, let $x_t(e)$ denote the real (per capita) provision of loans by savers.

A loan of one unit at time $t$ repays $R_t(e)$ units at time $t + 1$ (where $e$
refers to the time $t$ state), so that $R_t(e)$ is the (gross) rate of interest.
Finally, let the quantity borrowed by a representative borrower at $t$ be
given by $L_t(e)$.

Below equilibria will be examined in which $S_t(e)$ and $R_t(e)$ (and,
hence, $L_t(e)$, $x_t(e)$, and $m_t(e)$) are independent of $t$. Hence we follow
Azariadis (1981) in focusing on stationary equilibria.

B. Behavior of Agents 

Savers who are young in state $e$ choose $x_t(e)$ and $m_t(e)$ to solve the
following problem:

$$\max\; H[y - m_t(e) - x_t(e)] + q(e)\left[R_t(e)x_t(e) + m_t(e)\frac{S_{t+1}(1)}{S_t(e)}\right] + [1 - q(e)]\left[R_t(e)x_t(e) + m_t(e)\frac{S_{t+1}(2)}{S_t(e)}\right] \tag{1}$$

subject to $m_t(e) \ge 0$ and $x_t(e) + m_t(e) \le y$.
Notice that these agents (as will borrowers below) have rational
expectations and take $q(e)$, $R_t(e)$, and $\{S_t(e)\}$ as parametric. If $H'(c) > 0$, the
expected utility maximization problem for savers has the associated
first-order conditions

$$H'[y - x_t(e) - m_t(e)] = R_t(e), \quad e = 1, 2, \tag{2}$$

$$H'[y - x_t(e) - m_t(e)] = q(e)\frac{S_{t+1}(1)}{S_t(e)} + [1 - q(e)]\frac{S_{t+1}(2)}{S_t(e)}, \quad e = 1, 2, \tag{3}$$

where clearly only equilibria in which $0 < x_t(e) < y$ and $0 < m_t(e) < y - x_t(e)$
are considered.

An immediate implication of (2) and (3) is that loans and real balances
earn identical expected rates of return:

$$q(e)\frac{S_{t+1}(1)}{S_t(e)} + [1 - q(e)]\frac{S_{t+1}(2)}{S_t(e)} = R_t(e), \quad t \ge 0,\ e = 1, 2. \tag{4}$$

Also, by using equation (2), a savings function for type 1 agents can be
defined. Let $F[R(e)]$ denote the optimal savings function of type 1
agents: $x_t(e) + m_t(e) = F[R_t(e)]$ for all $e$, $t$, and $F[R(e)]$ is defined
implicitly by⁵

$$H'\{y - F[R_t(e)]\} = R_t(e). \tag{5}$$

Then $F'(R) = -H''[y - F(R)]^{-1} > 0$ (if $H'' < 0$). It is assumed that
$F(1) > 0$.

Borrowers who are young in state $e$ solve the problem

$$\max\; U[w_1 + L_t(e),\ w_2 - R_t(e)L_t(e)]$$

subject to $-w_1 \le L_t(e) \le w_2 R_t(e)^{-1}$. The first-order condition
associated with this problem is

$$U_1[w_1 + L_t(e),\ w_2 - R_t(e)L_t(e)] = R_t(e)\,U_2[w_1 + L_t(e),\ w_2 - R_t(e)L_t(e)], \tag{6}$$

where $U_j$ denotes the partial derivative of $U$ with respect to its $j$th
argument. Denote the value of $L_t(e)$ solving (6) by $L_t(e) = Q[R_t(e)]$; that
is, $Q[R_t(e)]$ is the demand function for loans by borrowers. It is
assumed that $Q(1) > 0$.

It will also be assumed below that $Q'(1) > 0$. This assumption plays
the same role that the assumption that the aggregate savings function
is decreasing in the (gross) rate of interest (at one) plays in Azariadis
(1981). The assumption that $Q'(1) > 0$ is equivalent to the assumption
that

$$U_2(\cdot) + Q(1)[U_{12}(\cdot) - U_{22}(\cdot)] < 0, \tag{7}$$

where the functions on the left-hand side of (7) are evaluated at $R = 1$.

5 If $H' \equiv 0$, as will be the case in an example below, then $F(R) = y$.


C. Steady-State Equilibrium

A steady-state (nonstochastic) equilibrium (with valued fiat money) is
a pair of values $(S, R)$ such that $R = 1$ and

$$S = \lambda F(1) - (1 - \lambda)Q(1). \tag{8}$$

As is implicit in the notation, $S$ and $R$ are independent of the state of
nature $e$.⁶


D. Stationary Sunspot Equilibria

As in Azariadis (1981), a stationary (sunspot) equilibrium is a set of
values $[q(1), q(2), R(1), R(2), S(1), S(2)]$ satisfying

$$S(e) = \lambda F[R(e)] - (1 - \lambda)Q[R(e)], \quad e = 1, 2, \tag{9}$$

(4), and $0 \le q(e) \le 1$, $e = 1, 2$.

It is now possible to provide a set of conditions under which stationary
sunspot equilibria exist for the model at hand. Define the function
$\Phi(x)$ by

$$\Phi(x) = x[\lambda F(x) - (1 - \lambda)Q(x)] \tag{10}$$

for all $x$ such that $0 \le \lambda F(x) - (1 - \lambda)Q(x) \le \lambda y$. By the assumption of
note 6, $\Phi(1) > 0$. Assume further that $\Phi'(1) < 0$, which is equivalent
to assuming that

$$(1 - \lambda)Q'(1) - \lambda F'(1) > S = \lambda F(1) - (1 - \lambda)Q(1). \tag{11}$$

Then the following proposition is straightforward.

Proposition. If $\Phi'(1) < 0$, then there exists an open neighborhood
of one, $N_\epsilon(1)$, such that, for all $R(1), R(2) \in N_\epsilon(1)$ satisfying $R(1) < 1
< R(2)$, there exist values $[q(1), q(2), S(1), S(2)]$ satisfying (4) and (9).
That is, for all $R(1)$, $R(2)$, with $R(1) < 1 < R(2)$ and with $R(1)$ and
$R(2)$ sufficiently close to one, there exist positive values $S(e) =
\lambda F[R(e)] - (1 - \lambda)Q[R(e)]$ with $S(1) > S(2)$ and values $[q(1), q(2)]$
such that setting $\lambda m_t(e) = S_t(e)$ and $x_t(e) = (1 - \lambda)Q[R(e)]/\lambda$ is a
solution to (2) and (3), $e = 1, 2$.

6 As in Azariadis (1981), $Q(R)$ is single-valued. It is assumed that $F(R)$ is single-valued
and also that $\lambda F(1) > (1 - \lambda)Q(1)$. Thus there is a unique steady-state value $S > 0$.

Remark. Condition (11) is just that (minus) the elasticity of aggregate
savings with respect to the rate of interest, evaluated at $R = 1$,
exceeds one.

The proposition can be proved by choosing arbitrary values $R(1)$,
$R(2) \in N_\epsilon(1)$ for some $\epsilon > 0$ that satisfy $R(2) > 1 > R(1)$ and then by
constructing equilibrium values $[q(1), q(2), S(1), S(2)]$. This is now
done. To begin, choose $R(1) < 1 < R(2)$ such that

$$0 < Q[R(1)] < Q(1) < Q[R(2)]. \tag{12}$$

For $R(1)$ and $R(2)$ sufficiently near one, this is possible since $Q(1) > 0$
and $Q'(1) > 0$. Then set

$$S(1) = \lambda F[R(1)] - (1 - \lambda)Q[R(1)], \qquad S(2) = \lambda F[R(2)] - (1 - \lambda)Q[R(2)]. \tag{13}$$

For $R(1)$ and $R(2)$ sufficiently near one, it is the case that $S(1) > S(2)$
since $R(2) > 1 > R(1)$ and $\lambda F'(1) - (1 - \lambda)Q'(1) < 0$. Thus for
appropriately chosen $\epsilon > 0$, $S(1) > S(2) > 0$.

It remains, then, to construct $q(1)$ and $q(2)$ and to show that these
are probabilities. Letting $S(1)$ and $S(2)$ be the values given by (13), set
$q(1)$ and $q(2)$ as follows:

$$q(1) = \frac{R(1) - [S(2)/S(1)]}{1 - [S(2)/S(1)]}, \tag{14}$$

$$q(2) = \frac{R(2) - 1}{[S(1)/S(2)] - 1}. \tag{15}$$

It is clear that, by construction, the values $[R(1), R(2), S(1), S(2), q(1),
q(2)]$ satisfy (4). Then it is necessary only to check that these values for
$q(1)$ and $q(2)$ satisfy $0 < q(e) < 1$, $e = 1, 2$.

It is apparent from (14) and the fact that $R(1) < 1$ that $0 < q(1) < 1$
if $R(1) > S(2)/S(1)$. Moreover, by equation (13), $R(1) > S(2)/S(1)$ is
equivalent to the condition

$$R(1)\{\lambda F[R(1)] - (1 - \lambda)Q[R(1)]\} > \lambda F[R(2)] - (1 - \lambda)Q[R(2)]. \tag{16}$$

Similarly, $0 < q(2) < 1$ if $S(1)/S(2) > R(2)$. By equation (13), $S(1)/S(2)
> R(2)$ is equivalent to the condition

$$R(2)\{\lambda F[R(2)] - (1 - \lambda)Q[R(2)]\} < \lambda F[R(1)] - (1 - \lambda)Q[R(1)]. \tag{17}$$
It is now easy to check that (16) and (17) are satisfied for $\epsilon$
sufficiently small. To prove that (16) holds, notice that, since $\lambda F'(1) -
(1 - \lambda)Q'(1) < 0$ and $R(2) > 1$,

$$\lambda F(1) - (1 - \lambda)Q(1) > \lambda F[R(2)] - (1 - \lambda)Q[R(2)] \tag{18}$$

if $R(2)$ is chosen sufficiently close to one. Then (16) is satisfied if

$$R(1)\{\lambda F[R(1)] - (1 - \lambda)Q[R(1)]\} > \lambda F(1) - (1 - \lambda)Q(1). \tag{19}$$

But (19) is just $\Phi[R(1)] > \Phi(1)$. Since $R(1) < 1$, (19) is therefore
satisfied if $R(1)$ is chosen sufficiently close to one since $\Phi'(1) < 0$.
Thus (16) is satisfied for $\epsilon$ sufficiently small.

Similarly it is easy to check that (17) is satisfied for appropriately
chosen $\epsilon > 0$. In particular, since $\lambda F'(1) - (1 - \lambda)Q'(1) < 0$ and $R(1)
< 1$,

$$\lambda F[R(1)] - (1 - \lambda)Q[R(1)] > \lambda F(1) - (1 - \lambda)Q(1) \tag{20}$$

if $R(1)$ is chosen sufficiently near one. But then (17) holds if

$$\lambda F(1) - (1 - \lambda)Q(1) > R(2)\{\lambda F[R(2)] - (1 - \lambda)Q[R(2)]\}. \tag{21}$$

Of course, (21) is just $\Phi(1) > \Phi[R(2)]$. Since $R(2) > 1$ and $\Phi'(1) < 0$,
(21) holds if $R(2)$ is sufficiently near one. Thus, for appropriately
chosen $\epsilon > 0$, (16) and (17) are satisfied, implying that $0 < q(e) < 1$, $e
= 1, 2$. This completes the proof.
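The construction used in the proof can be sketched as a small routine: pick $R(1) < 1 < R(2)$, compute $S(e)$ from (9), and back out $q(e)$ from the equal-expected-return condition (4). This is a minimal sketch rather than the paper's own procedure; the function name and the demonstration primitives `F_demo` and `Q_demo` (a constant savings function and a linear loan demand) are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def construct_sunspot_equilibrium(F, Q, lam, R1, R2):
    """Return (S1, S2, q1, q2) for chosen interest rates R(1) < 1 < R(2).

    F and Q are callables for the savings and loan-demand functions and
    lam is the fraction of savers. Raises ValueError when the chosen
    rates do not deliver a valid stationary sunspot equilibrium.
    """
    if not (R1 < 1.0 < R2):
        raise ValueError("need R(1) < 1 < R(2)")
    # Value of money in each state: S(e) = lam*F[R(e)] - (1 - lam)*Q[R(e)].
    S1 = lam * F(R1) - (1.0 - lam) * Q(R1)
    S2 = lam * F(R2) - (1.0 - lam) * Q(R2)
    if not (S1 > S2 > 0.0):
        raise ValueError("construction requires S(1) > S(2) > 0")
    # Solve q(e)*S(1)/S(e) + [1 - q(e)]*S(2)/S(e) = R(e) for q(e).
    q1 = (R1 - S2 / S1) / (1.0 - S2 / S1)
    q2 = (R2 - 1.0) / (S1 / S2 - 1.0)
    if not (0.0 < q1 < 1.0 and 0.0 < q2 < 1.0):
        raise ValueError("q(e) is not a probability; move R(e) closer to one")
    return S1, S2, q1, q2

# Hypothetical primitives, for illustration only.
F_demo = lambda R: 5.0                      # savers save their whole endowment
Q_demo = lambda R: 3.8 + 2.4 * (R - 1.0)    # upward-sloping loan demand near R = 1

S1, S2, q1, q2 = construct_sunspot_equilibrium(F_demo, Q_demo, 0.5, 0.98, 1.02)
print(S1, S2, q1, q2)
```

The `ValueError` branches correspond to the role of $\epsilon$ in the proof: if $R(1)$ and $R(2)$ are too far from one, the implied $q(e)$ need not lie strictly between zero and one.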

It is straightforward, then, to construct sunspot equilibria here if
$\Phi'(1) < 0$. Moreover, in at least some regards these equilibria validate
the Currency School arguments given in the previous section. Clearly,
at least relative to fundamentals, prices are “excessively variable,” and
cycles are not driven by fundamentals. However, even in more quantitative
regards the equilibria validate some of the propositions
pointed out in Section II. In particular, it was argued by the Currency
School that the stock of inside money (debt) expanded with high
prices. This is clearly true here since $e = 2$ is the high-price state
($S(1) > S(2)$, and $S(e)$ is the inverse price level) and $Q[R(2)] >
Q[R(1)]$.⁷ It was also seen that some members of the Currency School
argued that when “money is scarce” (presumably meaning that interest
rates are high), “an enlargement of issues takes place.” This is also
the case here since again $Q[R(2)] > Q[R(1)]$.


7 The identification of loans with inside money here mimics the usage in Sargent and
Wallace (1982).
E. Peel’s Bank Act

Sargent and Wallace (1982) presented some legal restrictions meant
to resemble Peel’s Bank Act in their implications.⁸ In their paper,
these restrictions took the form of preventing savers from making
loans in quantities less than some minimum amount, say $\bar{x}$, where $\bar{x}$ is
expressed in units of goods. Then set $\bar{x}$ at any value such that $\bar{x} > y$.
Here this separates money and credit markets since savers are now
precluded from lending. Then an equilibrium requires that

$$H'[y - \lambda^{-1}S(e)] = q(e)\frac{S(1)}{S(e)} + [1 - q(e)]\frac{S(2)}{S(e)}, \quad e = 1, 2, \tag{22}$$

$$Q[R(e)] = 0, \quad e = 1, 2. \tag{23}$$

Since $U(c_1, c_2)$ is strictly concave, clearly (23) implies a constant value
of $R$. Similarly, (22) implies a constant value of $S$. To see this, suppose
that $S(1) > S(2)$. Then, from (22),

$$H'[y - \lambda^{-1}S(1)] = q(1) + [1 - q(1)]\frac{S(2)}{S(1)} \tag{24}$$

and

$$H'[y - \lambda^{-1}S(2)] = q(2)\frac{S(1)}{S(2)} + 1 - q(2). \tag{25}$$

Solving (24) and (25) for $q(1)$ and $q(2)$ gives

$$q(1) = \frac{H'[y - \lambda^{-1}S(1)] - [S(2)/S(1)]}{1 - [S(2)/S(1)]}, \tag{26}$$

$$q(2) = \frac{H'[y - \lambda^{-1}S(2)] - 1}{[S(1)/S(2)] - 1}. \tag{27}$$

Since $S(1) > S(2)$, $q(1)$ is a probability iff $1 \ge H'[y - \lambda^{-1}S(1)] \ge
S(2)/S(1)$. Similarly, $q(2)$ is a probability iff $S(1)/S(2) \ge H'[y -
\lambda^{-1}S(2)] \ge 1$. But then $H'[y - \lambda^{-1}S(2)] \ge H'[y - \lambda^{-1}S(1)]$ or $S(2) \ge
S(1)$, contradicting the supposition. Assuming $S(2) > S(1)$ leads to an
identical contradiction. Thus $S(1) = S(2)$ must hold.

It is the case, then, that restrictions separating money from credit
markets will prevent sunspot equilibria for the model of this section.
Then such restrictions also prevent prices and the stock of (inside)
money from fluctuating “excessively.”


8 This is not stated explicitly in Sargent and Wallace (1982); see Sargent and Smith
(1987, p. 80, n. 2).
F. An Example

An example is now presented in which, in the absence of legal restrictions,
sunspot equilibria exist. The legal restrictions of Section IIIE
preclude the existence of such equilibria. The example permits some
investigation of whether there is any obvious welfare justification for
legal restrictions that separate money from credit markets.

Example 1. Let type 1 agents (savers) have the utility function
$W(c_1, c_2) = c_2$ and endowment stream $(y, 0)$, with $y = 5$. Let type 2
agents (borrowers) have the utility function

$$U(c_1, c_2) = c_1 - \tfrac{1}{8}(c_1 + .5c_2)^2 + .5145c_2 - .005c_2^2.$$

It is straightforward to verify that $U(\cdot)$ is concave, and $U_1(c_1, c_2)$ and
$U_2(c_1, c_2)$ will be positive in the equilibria constructed. The endowment
stream of borrowers is $(0, w_2)$, with $w_2 = 4$ and $\lambda = 1 - \lambda = 1/2$.

For these preferences and endowments, the demand function for
loans of type 2 agents is given by

$$Q(R) = \frac{.5 - .2245R}{.25(1 - R) + .0725R^2}.$$


It is now possible to construct the steady-state (monetary) equilibrium
for this economy. In particular, $R = 1$, $Q(1) = 3.8$, and
$S = \lambda[y - Q(1)] = .6$. The utility of type 1 agents in this
equilibrium is 5, and the utility of type 2 agents is
$U(3.8, .2) = 2.00145$.

It is also possible to show that the following values constitute a
stationary sunspot equilibrium: $R(1) = .98$, $R(2) = 1.02$,
$S(1) = \lambda\{y - Q[R(1)]\} = .62412$,
$S(2) = \lambda\{y - Q[R(2)]\} = .57601$, $q(1) = .74049$, and
$q(2) = .23946$. In particular, these values for $q(1)$ and $q(2)$ are
given by equation (4). Finally, under the regime that separates money
from credit markets, $R(e) = Q^{-1}(0) = 2.227$, and $S = \lambda y = 2.5$.
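The figures in Example 1 can be recomputed directly from the stated preferences and endowments. The following is a minimal sketch, not the author's procedure: the names `LAM`, `Y`, `W2`, and the function names are labels chosen for illustration, and the pricing conditions used to back out q(1) and q(2) are assumed to have the same form as (24) and (25) with R(e) on the left-hand side.

```python
# Sketch: recompute the equilibrium values reported in Example 1.
# LAM, Y, W2 and the function names are labels chosen for this illustration.

LAM = 0.5  # lambda, the share of savers (lambda = 1 - lambda = 1/2)
Y = 5.0    # endowment of a young saver
W2 = 4.0   # endowment of an old borrower


def U(c1, c2):
    """Utility of type 2 agents (borrowers) in Example 1."""
    return c1 - 0.125 * (c1 + 0.5 * c2) ** 2 + 0.5145 * c2 - 0.005 * c2 ** 2


def Q(R):
    """Loan demand of type 2 agents at gross interest rate R."""
    return (0.5 - 0.2245 * R) / (0.25 * (1.0 - R) + 0.0725 * R ** 2)


def S(R):
    """Real balances, S = lambda * [y - Q(R)]."""
    return LAM * (Y - Q(R))


def sunspot_probabilities(R1, R2):
    """Assumed pricing conditions (same form as (24)-(25), R(e) on the left):
    R(1) = q(1) + [1 - q(1)] S(2)/S(1),  R(2) = q(2) S(1)/S(2) + 1 - q(2)."""
    s1, s2 = S(R1), S(R2)
    q1 = (R1 - s2 / s1) / (1.0 - s2 / s1)
    q2 = (R2 - 1.0) / (s1 / s2 - 1.0)
    return q1, q2


if __name__ == "__main__":
    # Values quoted in the comments are those reported in the text.
    print("Q(1), S:", Q(1.0), S(1.0))                        # text: 3.8, .6
    print("U at steady state:", U(Q(1.0), W2 - Q(1.0)))      # text: 2.00145
    print("S(1), S(2):", S(0.98), S(1.02))                   # text: .62412, .57601
    print("q(1), q(2):", sunspot_probabilities(0.98, 1.02))  # text: .74049, .23946
    print("Q^-1(0), lambda*y:", 0.5 / 0.2245, LAM * Y)       # text: 2.227, 2.5
```

The recomputed probabilities agree with the reported ones only up to rounding in the later decimal places, so comparisons are best made with a small tolerance.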

It is now possible to ask whether, in the context of this example, 
there is any obvious sense in which it is desirable to separate money 
from credit markets via legal restrictions. An answer to this question 
requires a definition of Pareto optimality. Here several definitions are 
possible. One definition would follow Muench (1977), Peled (1982), 
and Cass and Shell (1983) in using a notion of conditional Pareto 
optimality or, in the terminology of Cass and Shell, dynamic Pareto 
optimality. According to this welfare criterion, the Peel's Bank Act (or 
quantity theory, as in Sargent and Wallace [1982]) regime dominates 
the laissez-faire regime only if it results in every agent’s being better 
off in every possible sequence of states. According to this criterion, 
the two regimes are not comparable since clearly
$U\{Q[R(2)], w_2 - R(2)Q[R(2)]\} > U(0, w_2)$, while savers young in
period $t$ who experience the sequence of states
$(e_t, e_{t+1}) = (1, 2)$ are worse off than they
would be if money and credit markets were separated. (The initial old
are always better off in the quantity theory than in the laissez-faire
regime.) Thus this criterion will clearly not provide a general rationale
for the legal separation of money and credit markets.

Another criterion that might be considered would evaluate the two
regimes according to the unconditional expectation of steady-state
utility. Specifically, define $q$ by $q = q(1)q + q(2)(1 - q)$, so that $q$
is the unconditional probability of the event $e_t = 1$. Then let us say
that it is desirable to separate money from credit markets iff

$$qU\{Q[R(1)], w_2 - R(1)Q[R(1)]\} + (1 - q)U\{Q[R(2)], w_2 - R(2)Q[R(2)]\} \le U(0, w_2) \tag{28}$$

and

$$q\left\{q(1) + [1 - q(1)]\frac{S(2)}{S(1)}\right\} + (1 - q)\left[q(2)\frac{S(1)}{S(2)} + 1 - q(2)\right] = qR(1) + (1 - q)R(2) \le 1, \tag{29}$$

where either (28) or (29) holds with strict inequality. 9

Clearly if $S(1) \ne S(2)$, then (28) fails since
$U\{Q[R(1)], w_2 - R(1)Q[R(1)]\} > U\{Q[R(2)], w_2 - R(2)Q[R(2)]\} > U(0, w_2)$.
Thus this criterion will not provide a rationale for the separation of
money from credit markets. Moreover, this point is not specific to the
example. However, in the context of the example, a stronger point can
be made: both (28) and (29) are violated by the equilibrium constructed
above. In particular, for the stationary sunspot equilibrium
constructed above, $q = .47991$ and $qR(1) + (1 - q)R(2) = 1.00079$.
Thus the unconditional expected utility of all young agents is lower
when money markets are separated from credit markets than under
laissez-faire. In short, the possibility of “excessive fluctuations” in
prices is not sufficient to imply the desirability of separating money
from credit markets, even if such a separation can prevent these
nonfundamental fluctuations. 10
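The two conditions can be checked numerically at the reported equilibrium values. The sketch below is illustrative only: the function names are hypothetical, the utility and loan demand functions are those of Example 1, and the probabilities q(1) and q(2) are taken as reported rather than recomputed.

```python
# Sketch: evaluate criteria (28) and (29) at the values reported in Example 1.
# Function names are hypothetical labels for this illustration.

W2 = 4.0  # endowment of an old borrower


def U(c1, c2):
    """Utility of type 2 agents (borrowers) in Example 1."""
    return c1 - 0.125 * (c1 + 0.5 * c2) ** 2 + 0.5145 * c2 - 0.005 * c2 ** 2


def Q(R):
    """Loan demand of type 2 agents at gross interest rate R."""
    return (0.5 - 0.2245 * R) / (0.25 * (1.0 - R) + 0.0725 * R ** 2)


def stationary_probability(q1, q2):
    """Solve q = q(1) q + q(2) (1 - q) for the unconditional probability of e = 1."""
    return q2 / (1.0 - q1 + q2)


def borrower_utility(R):
    """Utility of a borrower who borrows Q(R) when young and repays R * Q(R) when old."""
    loan = Q(R)
    return U(loan, W2 - R * loan)


if __name__ == "__main__":
    R1, R2 = 0.98, 1.02
    q = stationary_probability(0.74049, 0.23946)
    expected_return = q * R1 + (1.0 - q) * R2
    expected_utility = q * borrower_utility(R1) + (1.0 - q) * borrower_utility(R2)
    print("q:", q)
    print("expected return, compared with 1 in (29):", expected_return)
    print("expected borrower utility, left side of (28):", expected_utility)
    print("borrower utility without credit, U(0, w2):", U(0.0, W2))
```

Both left-hand sides exceed their right-hand sides at these values, which is the sense in which (28) and (29) are violated.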


9 Notice that this definition ignores the welfare of the initial old.

10 Since the initial old (or the initial old who have positive money holdings) benefit
from the imposition of the legal restrictions that separate money from credit markets, it
is easy to see why such restrictions might be imposed nonetheless. Moreover, this point
is not specific to the example; since a separation of money and credit markets leads to
higher equilibrium levels of real balances, the initial old will favor restrictions that
accomplish this.



JOURNAL OF POLITICAL ECONOMY

IV. Conclusions


Despite the existence of a significant literature about “legal restrictions
theories” of money, the only previously existing rationale for
such restrictions was that they might be part of an optimal inflation
tax scheme (Bryant and Wallace 1984). However, as has been seen,
some legal restrictions meant to separate money from credit markets
were accompanied by restrictions meant to prevent the raising of
seigniorage revenue. What, then, was the purpose of these restrictions?

It has been argued that some proponents of such restrictions were
concerned about the possibility of nonfundamental cycles if money
and credit markets were not separated. It has also been argued that it
is possible to construct coherent models in which the Currency School
views discussed in Section II are correct, as far as they go. Moreover,
the analysis here provides a sense in which, under a “real bills regime”
(in the Sargent-Wallace [1982] usage) the price level is indeterminate.
This indeterminacy can, under certain circumstances, be rectified by
the separation of money and credit markets. These features of real
bills regimes have often been claimed by adherents of what Sargent
and Wallace term “the quantity theory.”

Despite the fact that adherents of these views are correct under the
circumstances set out above, the analysis provides little support for
restrictions separating money and credit markets. In particular, it was
seen that there is no obvious welfare justification for such a legal
separation, even though Currency School and quantity theory arguments
are borne out in the model. The possibility remains that in a
production economy some welfare justification for these restrictions
could be provided, although such an analysis would not be consistent
with the flavor of much of the historical debate over the separation of
money and credit markets.

Another interesting extension would allow for more general forms
of the preferences of type 1 agents. If $W(c_1, c_2)$ is not linear in $c_2$,
then in general borrowers will borrow and hold money in any nontrivial
sunspot equilibrium. Construction of sunspot equilibria is then
complicated considerably. However, the general conclusions would
continue to hold. In particular, under conditions that are easily stated, a
separation of money and credit markets would preclude the existence
of stationary sunspot equilibria of the Azariadis (1981) type. Specifically,
Azariadis discussed conditions under which single-agent models
cannot have stationary sunspot equilibria of the type described here.
When money and credit markets are separated in this context, the
money market is described by a single-agent model similar to that
analyzed by Azariadis. Thus the conditions he gave that preclude
these equilibria would be sufficient for a legal separation of money
and credit markets to prevent the occurrence of sunspots of the kind
considered above. 11

This paper outlines a procedure for estimating a general index of
technical change within the context of a quite general production
technology. Specifically, when panel data are available for firms in an
industry, time-specific dummies can be combined in a nonlinear estimation
procedure to yield a general index of technical change that
may be both nonneutral and scale augmenting. This approach offers
numerous advantages over the traditional time trend representation
of technical change. For example, the general index can serve as the
basis for analysis of the determinants of technical change. Results for
a sample of 30 electric utilities over the period 1951-78 show that
the productivity decline of the 1970s can be attributed primarily to
sulphur oxide restrictions and secularly declining capacity utilization
due to rapidly increasing peak-load demands.


I. Introduction 

The dilemma posed by the measurement of technical change has led
to a bifurcation of approaches along the lines of econometric estimation
versus index numbers. 1 Beginning with Tinbergen (1942),
econometricians have described technical change with a simple time
trend as part of the econometric estimation of the production or cost
function. Advances by Christensen, Jorgenson, and Lau (1973) and
others have led to generalized functional forms in which the technology
may exhibit multiple inputs and outputs, quasi-fixed factors,
nonhomotheticity, and so forth (see, e.g., Stevenson 1980; Caves,
Christensen, and Swanson 1981; Gollop and Roberts 1983; Nelson and Wohar
1983). Likewise, the treatment of technological change has undergone
considerable generalization through the introduction of quadratic
terms in time and interactions of the time trend with factor
input prices and output, enabling technical change to increase at
nonconstant rates and to be both nonneutral and scale augmenting (see,
e.g., Gollop and Jorgenson 1980; Jorgenson and Fraumeni 1981;
Gollop and Roberts 1983). While these advances have loosened the
straitjacket within which technical change is posited to conform,
technical change is still driven primarily by linear and quadratic terms in
time.

We would like to thank M. Mokhtari for his helpful programming assistance and
Randy Nelson for providing us with his data set. We also wish to thank the referee
along with the participants of the Winter 1986 meetings of the Econometric Society for
helpful comments. Baltagi wishes to acknowledge the financial support of the
University of Houston Energy Laboratory.

1 For a review, see Nadiri (1970) and Diewert (1981).

An alternative avenue, utilizing index numbers, emanates from
Solow’s (1957) procedure for calculating a general index of neutral
technical change. 2 His index number approach and the subsequent
advances by Diewert (1976) and others on superlative index numbers
have freed technical change from the time trend straitjacket. Despite
the recent generalizations by Caves, Christensen, and Diewert (1982a,
1982b), the use of exact index numbers requires assuming a specific
second-order functional form, as well as requiring that returns to
scale be constant or decreasing.

For panel data sets with observations across firms in the same industry,
we outline a procedure that replaces the time trend with
time-specific dummies, enabling the estimation of a purely general index
of technical change. A secondary purpose of this paper is to ask how
this resulting index of technical change improves our understanding
of the determinants of technical change in electric utilities. Section II
provides a brief summary of efforts to measure technical change by
both econometric estimation and index numbers. Section III sets
forth two competing econometric methods to estimate technical
change when the underlying technology is general: the standard time
trend model and the general index approach proposed here. Section
IV presents the econometric results of the two approaches, examining
the contribution of scale effects and nonneutral technical change.
Section V contrasts the value of both the standard time trend index
and our general index as the basis for subsequent analysis of the
determinants of technical change. These results confirm Gollop and
Roberts’s (1983) findings on the importance of sulphur oxide restrictions
and Nelson’s (1984) findings that technical change tends to be
embodied via vintage capital effects. Section VI recapitulates the major
findings.

2 For other examples, see Denison (1967), Jorgenson and Griliches (1967), and
Christensen, Cummings, and Jorgenson (1980).

II. Overview of Econometric and Index Number 
Approaches to Measuring Technical Change 

Diewert (1981) categorized approaches to measuring technical 
change into the following four groups: econometric estimation of cost 
and production functions, Divisia indexes, exact index numbers, and 
nonparametric methods using linear programming. Because of the 
computational burden and the inadmissibility of negative technical 
change of the latter, most research effort has focused on direct 
econometric estimation or the calculation of index numbers of either 
the Divisia or exact index varieties. 


A. Econometric Estimation 

Despite the evolution of a more general measure of technical change
in a more general production model, the first- and second-order time
trend terms tend to dominate, producing a smooth, slowly changing
characterization of the pace of technical change. Mansfield’s (1968)
industry-level studies of the diffusion of new processes and products,
in which he accepted a narrow definition of technical change as essentially
technological progress involving advances in the state of knowledge,
revealed highly variable rates of adaptation, not characterized
by simple time trends. Kopp and Smith (1983) likewise found that the
time trend is a poor proxy for the pace of innovation.

Dissatisfaction with the standard time trend model led Stevenson
(1980) and Gollop and Roberts (1981) to postulate more general
characterizations using time trends. For example, in Gollop and Roberts,
each factor input was allowed to exhibit factor-augmenting technical
change, which grows at a constant rate based on a simple time trend.
In other applications, the time trend has been abandoned altogether
in favor of more direct measures of technical change such as capitalized
R & D expenditures (Denny, Fuss, and Waverman 1981) and
vintage capital measures (Pescatrice and Trapani 1980; Nelson 1984).
Despite these advances, the standard time trend remains the norm.
Explicit measures of technical change are not readily available, except
possibly at the firm or industry level. 3

3 While specific measures of technical change are no doubt an improvement over the
time trend, any one measure, such as a vintage capital index, is likely to provide only a
single-dimensional measure to a phenomenon caused by a number of factors. See, e.g.,
Sec. V.

Dissatisfaction with the assumption of technical change occurring at a
constant rate prompted Solow (1957) to specify a general index of
technical change $A(t)$ as follows:

$$Q_t = A(t)F(L, K). \tag{1}$$

Solow’s purely general index of technical change revealed quite different
growth rates in $A(t)$ for the U.S. economy for the period 1909-49.
Such divergences from a constant time trend are to be expected
if we accept Solow’s definition of technical change: “a shorthand
expression for any kind of shift in the production function.” Thus $A(t)$
can reflect the effects of short-run disequilibria, as well as the long-term
effects of the diffusion of new processes associated with technological
change.

Solow’s general index of disembodied technical change required
three restrictive assumptions: constant returns to scale, neutral technical
change, and perfect competition in both output and factor input
markets. Under these conditions, technical change ($\dot{T}$) is equivalent to
the percentage growth in total factor productivity ($\dot{TFP}$). Specifically,
$\dot{TFP}$ is calculated as the percentage growth in output ($\dot{Q}$) less
the percentage change in a Divisia index of inputs as follows:

$$\dot{T} = \dot{TFP} = \dot{Q} - \sum_i S_i \dot{x}_i, \tag{2}$$

where the Divisia input index is the sum of the percentage growth in each
input ($\dot{x}_i$) weighted by its cost share ($S_i$).
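Equation (2) is stated in continuous time. As a hedged illustration of how it might be applied to discrete annual data, the sketch below uses log differences for growth rates and two-period average cost shares, which is the usual Tornqvist-style discrete approximation. The function name and the sample figures are hypothetical and are not taken from the paper's data set.

```python
import math


def tfp_growth(q_prev, q_curr, x_prev, x_curr, s_prev, s_curr):
    """Discrete approximation to equation (2).

    Growth rates are measured as log differences, and each input's growth is
    weighted by the average of its cost share in the two periods.
    """
    output_growth = math.log(q_curr / q_prev)
    input_growth = sum(
        0.5 * (s0 + s1) * math.log(x1 / x0)
        for x0, x1, s0, s1 in zip(x_prev, x_curr, s_prev, s_curr)
    )
    return output_growth - input_growth


if __name__ == "__main__":
    # Hypothetical firm: output up 5 percent, labor up 1 percent, capital up 3 percent.
    growth = tfp_growth(
        q_prev=100.0, q_curr=105.0,
        x_prev=[50.0, 200.0], x_curr=[50.5, 206.0],
        s_prev=[0.40, 0.60], s_curr=[0.38, 0.62],
    )
    print("TFP growth (log points):", growth)
```

When output and every input grow in the same proportion and the shares sum to one, the measure is zero, which reflects the constant-returns assumption built into the index.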

Unfortunately, the equivalence between total factor productivity
and technical change quickly breaks down with more general production
technologies. For example, in technologies exhibiting increasing
returns to scale, the growth in factor productivity may be attributable
to movements along the cost function rather than downward shifts in
costs. Other restrictive assumptions include perfect competition and
the neutrality of technical change.

Diewert (1976) showed that there exists a class of superlative index
numbers corresponding to various production technologies based on
second-order approximations. In particular, the Tornqvist index,
which offers a discrete approximation to the Divisia index, is based on
a translog technology. For the case of the translog cost function,
Diewert showed that the percentage change in costs depends on the
share-weighted change in input prices ($P_i$), the percentage change in
output weighted by the elasticity of costs with respect to output, and
technical change:

where the superscript * indicates observations drawn in period $t^*$. It is
easy to see from equation (3) that it is not necessary to estimate the
translog function, provided constant returns to scale can be assumed.
Furthermore, it is not necessary to know $\partial \ln C/\partial t$ since technical
change can be derived as the residual. Thus, if the technology is
translog, the Tornqvist index allows partial relaxation of the three
assumptions underlying Solow’s Divisia index. From equation (3), we
see that knowledge of the returns-to-scale elasticity is still necessary;
furthermore, Shephard’s lemma is utilized to generate the cost shares
($S_i$) so that perfect competition in input purchases is required. It is,
however, no longer necessary to assume that there is perfect competition
in the output market 4 or that technical change is neutral.

In sum, the Tornqvist index provides a very convenient mechanism
for computing technical change, avoiding any necessity to econometrically
estimate the production technology. The resulting measure is
conditional on the technology’s being translog and on knowledge of
the returns-to-scale parameter, 5 as well as the standard competitive
assumptions. Obviously, in applications involving industries with
increasing returns to scale, such as electric utilities, the exact index
number approach is likely to result in biased estimates of technical
change. Denny and Fuss (1983) illustrated that if the technology is not
translog or if the second-order translog parameters differ across
firms, the Tornqvist index can result in considerable distortion. In
these situations, econometric estimation cannot be avoided. There is
an additional argument supporting econometric estimation with technical
change explicitly embedded: Parameter estimates of the underlying
substitution elasticities and scale parameters are important in
their own right, and a properly specified measure of technical change
will reduce the bias in these measures.

4 Using a cost function, we require only cost minimization and fixed input prices to
firms. In the joint production case, perfect competition in output markets is also a
required assumption.

5 As shown in Caves, Christensen, and Diewert (1982a), the elasticity of cost with
respect to output can be calculated using conventional data for the cases of both
constant and decreasing returns to scale. For increasing returns to scale, independent
econometric estimates are required as in Nelson and Wohar (1983).


III. Model Specifications 

A. Standard Time Trend Model 


For purposes of comparison, we utilize a translog cost function specification
that explicitly postulates a nonhomothetic technology in
which a time trend representation of technical change may be nonneutral
and scale augmenting as follows:

$$\begin{aligned}
\ln C = {} & \alpha_0 + \sum_k \lambda_k D_k + \sum_i \alpha_i \ln P_i + \gamma \ln Q + \delta T \\
& + \tfrac{1}{2}\sum_i \sum_j \beta_{ij} \ln P_i \ln P_j + \tfrac{1}{2}\gamma^*(\ln Q)^2 + \tfrac{1}{2}\delta^* T^2 \\
& + \sum_i \phi_i T \ln P_i + \sum_i \psi_i \ln P_i \ln Q + \theta T \ln Q,
\end{aligned} \tag{4}$$

where $C$ is total cost, $D_k$ are firm-specific dummies ($k = 2, \ldots, m$), $P_i$
are input prices, $Q$ is output, and $T$ is a simple time trend.

Invoking Shephard’s lemma, one obtains the familiar cost shares
($S_i$), which together with equation (4) provide the basis for estimation:

$$S_i = \frac{\partial \ln C}{\partial \ln P_i} = \alpha_i + \sum_j \beta_{ij} \ln P_j + \phi_i T + \psi_i \ln Q, \quad i = 1, \ldots, n. \tag{5}$$


Given estimation of the parameters in equations (4) and (5), it is
possible to compute the rate of technical change as

$$\dot{T} = \frac{\partial \ln C}{\partial T} = \delta + \delta^* T + \sum_i \phi_i \ln P_i + \theta \ln Q. \tag{6}$$

In turn, the growth in technical change can be decomposed into the
following three components: (1) effects due to pure technical change
($\delta + \delta^* T$), (2) effects due to nonneutral technical change
($\sum_i \phi_i \ln P_i$), and (3) effects due to scale-augmenting technical
change ($\theta \ln Q$).
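The three-way decomposition of equation (6) can be sketched as a short function. This is an illustration under stated assumptions: the function name is hypothetical and the parameter values in the usage example are invented for demonstration, not estimates from the paper.

```python
import math


def technical_change_trend(delta, delta_star, phi, theta, T, prices, output):
    """Decompose equation (6) for the time trend model.

    Returns (pure, nonneutral, scale, total), where
      pure       = delta + delta_star * T
      nonneutral = sum_i phi_i * ln P_i
      scale      = theta * ln Q
    """
    pure = delta + delta_star * T
    nonneutral = sum(phi_i * math.log(p_i) for phi_i, p_i in zip(phi, prices))
    scale = theta * math.log(output)
    return pure, nonneutral, scale, pure + nonneutral + scale


if __name__ == "__main__":
    # Invented parameter values, for illustration only.
    parts = technical_change_trend(
        delta=-0.03, delta_star=0.001,
        phi=[0.002, -0.001, -0.001], theta=-0.004,
        T=10, prices=[1.2, 0.9, 1.5], output=3.0,
    )
    print("pure, nonneutral, scale, total:", parts)
```

Note that the pure component depends only on T, so it is nonzero even when prices and output are held fixed.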

Furthermore, given estimates of technical change in equation (6)
and an estimate of the elasticity of cost with respect to output
($\varepsilon_{CQ}$) obtained by differentiating equation (4), it is possible to
compute the estimated percentage change in total factor productivity
($\dot{TFP}$) as

$$\dot{TFP} = -\dot{T} + (1 - \varepsilon_{CQ})\dot{Q}. \tag{7}$$

In general, since the estimated measure of total factor productivity
relies on the parameter estimates in equation (4), $\dot{TFP}$ will differ from
the observed change in total factor productivity, which can be measured
directly by equation (2). This paper demonstrates empirically
that observed and estimated indexes of total factor productivity differ
appreciably because of the restrictive characterization of technical
change in equation (6). 6

The general production technology implicit in equation (4) is purchased
at the expense of a restrictive characterization of technical
change. Pure technical change ($\delta + \delta^* T$) will be constant or increasing
or decreasing at a constant rate. In effect, by placing pure technical
change in a straitjacket, the overall estimates of technical change in
equation (6) are likely to be dominated by these two terms. Furthermore,
to the degree that the misspecification of pure technical change
is correlated with other variables, our ability to decompose technical
change is severely undermined. If output changes are correlated with
the time trend, the effects of scale-augmenting technical change
($\theta \ln Q$) may be attributed to pure technical change and vice versa.
Similarly, secularly trended input prices may in turn be attributed to a
constant rate of pure technical change leading to biased estimates of
the effects of nonneutral technical change ($\sum_i \phi_i \ln P_i$).

B. Incorporation of a General Index of Technical Change 

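The motivation for our approach is Solow’s index of technical change
$A(t)$, except in our specification $A(t)$ can be both nonneutral and scale
augmenting. Our approach involves altogether replacing $T$ and $T^2$
with a purely general index of technical change, $A(t)$, as follows:

$$\begin{aligned}
\ln C = {} & \alpha_0 + \sum_k \lambda_k D_k + A(t) + \sum_i \alpha_i \ln P_i + \gamma \ln Q \\
& + \tfrac{1}{2}\sum_i \sum_j \beta_{ij} \ln P_i \ln P_j + \tfrac{1}{2}\gamma^*(\ln Q)^2 \\
& + \sum_i \phi_i A(t) \ln P_i + \sum_i \psi_i \ln P_i \ln Q + \theta A(t) \ln Q.
\end{aligned} \tag{8}$$

The corresponding cost shares are

$$S_i = \frac{\partial \ln C}{\partial \ln P_i} = \alpha_i + \sum_j \beta_{ij} \ln P_j + \phi_i A(t) + \psi_i \ln Q, \quad i = 1, \ldots, n. \tag{9}$$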

Estimation of equations (8) and (9) would be a simple matter if only
$A(t)$ were observable. However, utilizing dummy variables and a
pooled data set, 7 we can estimate equations (8) and (9) as

$$\begin{aligned}
\ln C = {} & \sum_k \lambda_k D_k + \sum_t \alpha_{0t} D_t + \tfrac{1}{2}\sum_i \sum_j \beta_{ij} \ln P_i \ln P_j + \tfrac{1}{2}\gamma^*(\ln Q)^2 \\
& + \sum_i \psi_i \ln P_i \ln Q + \sum_i \sum_t \alpha_{it} D_t \ln P_i + \sum_t \theta_t D_t \ln Q,
\end{aligned} \tag{10}$$

where $D_t$ is a time-specific dummy ($t = 1, \ldots, T$) and $D_k$ is a
firm-specific dummy ($k = 2, \ldots, m$). The corresponding share equation is

$$S_i = \sum_t \alpha_{it} D_t + \sum_j \beta_{ij} \ln P_j + \psi_i \ln Q, \quad i = 1, \ldots, n. \tag{11}$$

Equations (10) and (11) are identical to equations (8) and (9) if and
only if

$$\alpha_{0t} = \alpha_0 + A(t), \tag{12a}$$

$$\alpha_{it} = \alpha_i + \phi_i A(t), \tag{12b}$$

$$\theta_t = \gamma + \theta A(t). \tag{12c}$$

Our estimates of $A(t)$ can be derived by imposing the restrictions in
equations (12a)-(12c) on the system of equations in (10) and (11). 8
Furthermore, taking the initial year as the base year for $A(t)$ (i.e.,
$A(1) = 0$) allows us to identify $\alpha_0$, $\alpha_i$, and $\gamma$, as well as
the index $A(t)$.

6 For a nice discussion of measurement error under alternative approaches, see
Diewert (1981).

7 Casual inspection might suggest that Caves, Christensen, and Swanson (1981)
developed a similar procedure since they used dummies to account for differences
between cross sections in 1955, 1963, and 1974. Their use of dummy variables was purely
ad hoc and did not lead to a general index of technical change. In our formulation, the
same $A(t)$ index interacts with factor prices and output.

On the face of it, it would appear that the general technology index
model greatly increases the number of parameters; however, the result
is a system of equations with only T - 3 parameters in excess of
the standard time trend model. As in many industries, the number of
firms is likely to be large relative to T, thereby justifying reliance on
large-sample properties for estimation purposes.
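To make the restrictions (12a)-(12c) and the parameter count concrete, the sketch below maps a set of structural parameters and an index A(t) into the time-dummy coefficients, then recovers the index under the normalization A(1) = 0. It is not the estimation procedure itself (which is nonlinear and joint across equations); the function names and numerical values are hypothetical. The count assumes the only parameters that differ between the two models are δ and δ* in the trend model and the T - 1 free values of A(t) in the index model.

```python
def dummy_coefficients(alpha0, alpha, gamma, phi, theta, A):
    """Apply restrictions (12a)-(12c). A[0] is the base year, so A[0] should be 0."""
    alpha0_t = [alpha0 + a for a in A]                                   # (12a)
    alpha_it = [[a_i + p_i * a for a in A] for a_i, p_i in zip(alpha, phi)]  # (12b)
    theta_t = [gamma + theta * a for a in A]                             # (12c)
    return alpha0_t, alpha_it, theta_t


def recover_index(alpha0_t):
    """With A(1) = 0, (12a) implies A(t) = alpha_0t - alpha_01."""
    return [c - alpha0_t[0] for c in alpha0_t]


def excess_parameters(T):
    """T - 1 free index values replace the two trend parameters delta and delta*."""
    return (T - 1) - 2


if __name__ == "__main__":
    A = [0.0, -0.02, -0.05]  # hypothetical index over three years
    a0t, ait, tht = dummy_coefficients(1.5, [0.3, 0.5], 0.8, [0.1, -0.1], -0.02, A)
    print("alpha_0t:", a0t)
    print("recovered A(t):", recover_index(a0t))
    print("excess parameters for T = 28:", excess_parameters(28))
```

In the base year the dummy coefficients reduce to α_0, α_i, and γ themselves, which is why the normalization pins those parameters down.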

To reiterate, important differences in the characterization of technical
change exist in the two models. Analogous to equation (6), technical
change in the general technical index model becomes

$$\dot{T} = A(t) - A(t - 1) + \sum_i \phi_i [A(t) - A(t - 1)] \ln P_i + \theta [A(t) - A(t - 1)] \ln Q. \tag{13}$$

In turn, technical change can be decomposed into the following
three components: (1) the effects of pure technical change,
$A(t) - A(t - 1)$; (2) the effects of nonneutral technical change,
$\sum_i \phi_i [A(t) - A(t - 1)] \ln P_i$; and (3) the effects of scale
augmentation, $\theta [A(t) - A(t - 1)] \ln Q$. Note that year-to-year changes
in the estimates of $A(t)$ may even provide a very erratic pattern of
technical change, reflecting the effects of technological epochs as in
Solow’s index. Furthermore, changes in $A(t)$ reflecting the underlying
pattern of pure technical change also reflect the extent to which factor
bias and scale augmentation affect overall technical change. For example,
if $A(t)$ is unchanged, factor prices or output will have no effect on the
overall rate of technical change. This behavior is in sharp contrast to
the standard time trend model in which, for constant prices and output,
there would still be an impact on technical change.

8 This can be easily implemented using the SYSNLIN procedure in SAS. In particular,
we used the nonlinear iterated seemingly unrelated regression procedure (see
Gallant and Jorgenson 1979).
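The contrast with the time trend model can be seen by coding equation (13) directly. The sketch below is illustrative: the function name and the numbers are hypothetical, and `A` is treated as a list of already estimated index values indexed from the base year.

```python
import math


def technical_change_index(A, phi, theta, t, prices, output):
    """Decompose equation (13) for the general index model.

    A is a list of index values with A[0] the base year; t >= 1 is a list position.
    Returns (pure, nonneutral, scale, total).
    """
    dA = A[t] - A[t - 1]
    pure = dA
    nonneutral = dA * sum(phi_i * math.log(p_i) for phi_i, p_i in zip(phi, prices))
    scale = theta * dA * math.log(output)
    return pure, nonneutral, scale, pure + nonneutral + scale


if __name__ == "__main__":
    A = [0.0, -0.02, -0.02]  # hypothetical index: a shift in year 1, none in year 2
    phi, theta = [0.1, -0.1], -0.02
    prices, output = [2.0, 3.0], 5.0
    print("year 1:", technical_change_index(A, phi, theta, 1, prices, output))
    print("year 2:", technical_change_index(A, phi, theta, 2, prices, output))
```

Every component is scaled by the change in A(t), so a year with no movement in the index shows no technical change of any kind, whatever the level of prices or output.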

IV. A Comparison of Results 

A. Description of Data and Estimation Procedures 

Our data base was provided by Nelson and Wohar (1983) and consisted
of annual time-series data for 30 electric utilities for the years
1951-78. The choice of electric utilities is particularly germane given
the numerous studies of productivity in this industry. 9 The choice of
electric utilities raises the question whether regulatory bias affects the
rate of technical change, substitution relationships, and so forth,
necessitating a more elaborate model. Our reading of the existing
literature is that the case for inclusion of Averch-Johnson effects is
hardly dispositive. 10 Instead, following Joskow’s (1974) view of the
regulatory process, we view regulatory tightness as a possible determinant
of the rate of technical progress in Section V.

With a data set consisting of 840 observations, estimation of both
cost functions utilized iterative Zellner efficient estimation on a system
of two cost shares (labor and capital with the fuel share excluded) plus
the total cost function. The usual symmetry conditions
($\beta_{ij} = \beta_{ji}$) and adding-up restrictions were imposed. 11

The parameter estimates for the two competing models are given in
appendix table A1. 12 The standard tests for a well-behaved cost function
were disappointing. Symmetry was rejected for both models. 13
Additional tests for concavity in input prices revealed a negative
semidefinite Hessian matrix when evaluated at the sample means;
however, concavity violations were observed for 97 and 83 of the 840
observations in the standard time trend model and the general technology
index model, respectively. It is interesting to note that most of
these violations occurred in the 1975-78 period, during which oil
price shocks and environmental controls affected utilities. While this
paper does not follow the tack of estimating the short-run cost function,
such an approach would appear appealing, especially to the
extent that short-run disequilibria were operative over these years.

9 To name just a few, see Barzel (1963), Gollop and Jorgenson (1980), Gollop and
Roberts (1981), Nelson and Wohar (1983), and Atkinson and Halvorsen (1984).

10 See, e.g., Joskow (1974). Furthermore, the numerical estimates in Nelson and
Wohar (1983) generally provide little empirical support for their importance. An
alternative view is set forth in Atkinson and Halvorsen (1984).

11 Given that nonlinear estimation can be sensitive to the initial starting values,
sensitivity analysis was performed, indicating that our results were robust to alternative
starting values.

12 Note that for parameters common to the two models, the coefficients are quite
similar. An apparent anomaly is $\phi_K$ and $\phi_L$, which differ because $A(t)$ and $T$
are not scaled similarly. In fact, the former is generally decreasing while $T$ is
increasing, explaining the opposite signs.

13 For the time trend model the likelihood ratio statistic ($-2 \ln$ of the likelihood
ratio) is 114.7. This is distributed as chi-squared with three degrees of freedom
($\chi^2_{.05} = 7.8$).

B. Estimates of Total Factor Productivity 

As a first step, we contrast the ability of our two competing models to
explain the observed growth in total factor productivity. Initially, a
Tornqvist approximation to a Divisia index of factor productivity is
computed from observed cost, input price, and output data as in
equation (2):

$$TFP_t = TFP_{t-1}(1 + \dot{TFP}), \quad TFP_{1951} = 1. \tag{14}$$
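The chaining in equation (14) amounts to a cumulative product of one plus each year's growth rate, starting from a base value of 1. A minimal sketch follows; the function name and the growth rates are hypothetical.

```python
def chain_index(growth_rates, base=1.0):
    """Build an index from period growth rates as in equation (14).

    index[0] is the base year; index[t] = index[t - 1] * (1 + growth_rates[t - 1]).
    """
    index = [base]
    for g in growth_rates:
        index.append(index[-1] * (1.0 + g))
    return index


if __name__ == "__main__":
    # Hypothetical annual TFP growth rates: two good years, then a sharp decline.
    print(chain_index([0.03, 0.02, -0.10]))
```

Multiplying the resulting series by 100 gives an index on the scale used when productivity levels are quoted relative to a base of 100.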

Similarly, utilizing the parameter estimates for each of the two models,
we computed $\dot{TFP}$ and derived an econometrically estimated index
of total factor productivity. 14 After computing such indexes for
the 30 electric utilities, we computed an industry aggregate index
using varying output weights over time (see fig. 1). Figure 1 contrasts
the observed industry aggregate with the indexes implied by the standard
time trend model and the general technical index model. The
observed index of industry total factor productivity suggests two
technological epochs. Up until 1969, when the index reaches 157, there is
a pattern of uneven but appreciable productivity growth, averaging
2.4 percent annually. For the period 1969-78, the total factor productivity
index shows no appreciable trend. With the recession following
the Arab oil embargo, total factor productivity plummets in 1975
from 157.9 to 139.5. The following year, the industry recovers from
the first energy price shock as productivity returns almost to the 1974
level but experiences very little growth thereafter.

The standard time trend model yields a total factor productivity
index that differs significantly from the Divisia index in several respects.
First, it exhibits a smooth pattern without the year-to-year
fluctuations in 1974-76, for example. Second, the tendency is for
predicted total factor productivity to grow at a faster rate for the early
period 1951-58. Then for the period 1959-69, the standard time
trend model exhibits a generally slower growth in productivity. Finally,
for the period 1974-78, predicted total factor productivity exhibits
a declining pattern.

14 Computation of the index using a continuous translog cost function at two discrete
points in time was done using the quadratic approximation lemma (see Diewert 1976).

Fig. 1. Total productivity index

The general index model closely tracks observed total factor productivity,
rising appreciably in 1955, 1959, 1966, and 1976 and falling
sharply in 1975. In terms of root mean square forecast error, the
general index model’s error is 2.57, as compared to 7.01 in the standard
time trend model.
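As a reminder of how a root mean square error of this kind is formed, here is a minimal sketch. The helper `rmse` and the two short series are hypothetical; the paper's 2.57 and 7.01 come from its own index series, which are not reproduced here.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equal-length series."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical index levels (assumed for illustration only).
example_error = rmse([100.0, 110.0], [103.0, 106.0])
```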

C. Decomposition of Total Factor Productivity into 
Technical Change and Scale Effects 

What do the two competing models imply about the relative importance
of technical change and scale effects? Table 1 compares, for
selected time intervals, the predicted average growth in total factor
productivity and its decomposition into technical change and scale
economies. For each model, firm-level estimates of productivity
growth were developed and then weighted by each firm's share of industry
output to produce industry-level estimates of total factor productivity
and its decomposition into technical change and scale economies
effects. Table 1 shows that the dissimilar estimates of productivity
growth are due primarily to differences in the estimates of technical
change. Both models suggest that the effects of scale economies are
rather small compared with technical change as a determinant of
productivity growth. Reference to the parameter estimates for the
scale coefficients (γ and γ* in app. table A1) shows that in both models
the long-run average cost curve declines over the range of firms with
low output, then becomes flat, and rises slightly for high-output firms.
Consequently, for the lower outputs observed in the 1950s, scale effects
accounted for about 20 percent of total factor productivity
growth. By the 1970s, most firms appear to have been operating in
the constant cost range, explaining the declining relative importance
of scale effects. Thus diminished scale effects appear to have played a
small part in the decline of productivity growth since 1968.
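The output-share weighting used to move from firm-level to industry-level estimates can be sketched as follows. The function `industry_growth` and the two-firm numbers are hypothetical, and the sketch assumes a single period with fixed weights, whereas the paper lets the weights vary over time.

```python
def industry_growth(firm_growth, firm_output):
    """Output-share-weighted average of firm-level growth rates."""
    total_output = sum(firm_output)
    if total_output <= 0:
        raise ValueError("total output must be positive")
    return sum(g * q / total_output for g, q in zip(firm_growth, firm_output))

# Two hypothetical firms: the larger firm dominates the industry average.
example_growth = industry_growth([0.02, 0.04], [3.0, 1.0])
```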

On the face of it, the finding that scale effects played a relatively
small role compared with technical change seems contrary to a long
tradition of individual plant studies emphasizing the importance of
scale economies and their potential for stimulating productivity
growth (see, e.g., Nerlove 1963; Christensen and Greene 1976). It is
important to remember that our sample is at the firm level, where
scale effects are probably much less important than at the plant
level.¹⁵ Our results accord closely with Nelson and Wohar’s (1983)
estimate of $\varepsilon_{CQ}$ of .945 for the period 1950-73, during which they
estimated that scale economies accounted for only 13.6 percent of the
growth in productivity.

D. Decomposition of Technical Change 

As noted earlier, technical change can in turn be decomposed into the
effects due to pure technical change, nonneutral technical change,
and scale-augmenting technical change, as shown in equation (6) for
the standard time trend model and in equation (13) for the general
technical index model. Do these two competing representations yield
different implications about the relative role of these sources of
technical change? Table 2 attempts to answer this question.

Both models find that technical change has been nonneutral of an
energy-using nature,¹⁶ corroborating Stevenson’s (1980), Gollop and
Roberts’s (1981), and Jorgenson and Fraumeni’s (1981) findings.
Nevertheless, the results here suggest that rising energy prices in the
1970s played a rather modest direct role in both models. Also, technical
change does not appear to have significantly affected economic
scale since the coefficient θ in both models is small and not statistically
significant. Thus scale-augmenting technical change can be considered
unimportant.

15 For a discussion of firm vs. plant scale effects, see Nerlove (1963).

16 From app. table A1, it would appear that the sign of $\phi_F$ (= $-[\phi_K + \phi_L]$) is
conflicting in the two models. In fact they are implying a similar result because A(t) is
generally declining over time while T is increasing.

Even though energy-using technical change tends to reduce the
reported rate of technical change in the 1970s, pure technical change
is the primary component directing the pattern of overall technical
change in the 1950s, 1960s, and even the 1970s. On average, for the
period 1952-78, pure technical change accounts for 96.5 percent of
technical change in the standard time trend model and 97.5 percent
in the general technical index model. Even though the two models give
different year-to-year changes in technical change, it is reassuring that
the relative importance of pure, nonneutral, and scale-augmenting
technical change does not differ greatly in the two models.

V. Implications of the Two Competing Indexes 

There are important differences in the two indexes, both in the visual
description of technical change and in their usefulness for analyzing
the causes of technical change. From figure 1 and table 2, it is apparent
that the two indexes give quite different measures of technical
change. But which measure advances our understanding of the
causes behind these visual patterns? The time trend model implies
that technical change may fluctuate randomly about the time trend
index, but the underlying cause of technical change is the mere passage
of time. In figure 1, the quadratic time trend leads to a peak in
technical change in 1974, followed by a period of negative technical
change. But even if the quadratic term were omitted to avoid the
implications of technical regress occurring in the future, it is apparent
from figure 1 that a simple time trend cannot simultaneously describe
a period of rapid technical change (1950s and 1960s) and a period of
stagnant technical change (the 1970s).

Whereas the time trend model yields a smooth index driven by the
passage of time, the general technical index makes no a priori statement
about the causes of technical change. However, it offers an
economically rich series that can then serve as the basis for analyzing
the causes of technical change (see, e.g., Solow 1957; Barzel 1963;
Kendrick 1973). To illustrate, we perform a simple econometric analysis
to explain the determinants of technical change. On the basis of
previous research, we postulate that there are four determinants of
technical change in electric utilities. First, since technical change in
Solow’s vernacular includes any shift in the production structure, it is
logical to include capacity utilization (CU) to reflect short-run
disequilibria.¹⁷ Second, sulphur oxide regulations imposed in the 1970s
are likely to have impeded productivity growth, as shown in Gollop
and Roberts (1983). Unlike their measure, based on detailed plant-by-plant
emission constraints, we use as a measure of sulphur oxide
regulations the estimated sulphur oxides emitted per kilowatt hour
(SOKWH) under the assumption that, after 1970, regulatory constraints
were binding.¹⁸ Third, Nelson (1984) provided persuasive
evidence that technical change is embodied in the vintage of capital
equipment (VINT).¹⁹ Fourth, following Joskow (1974), we posit that
the tightness of regulation may affect the intensity with which costs
are minimized. Accordingly, we include as a measure of regulatory
tightness the weighted-average rate of return for the 30 utilities relative
to the AAA bond rate for utilities (REGT). Equation (15) shows
the result of a linear regression utilizing capacity utilization (CU),
sulphur oxide restrictions (SOKWH), the average year of capital installation
(VINT), and regulatory tightness (REGT) regressed against the general
index of technical change ($T^*$):

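$$T^* = \underset{(3.5)}{-1.865} + \underset{(2.0)}{.665}\,\mathrm{CU} + \underset{(3.9)}{29.90}\,\mathrm{SOKWH} + \underset{(3.8)}{.0188}\,\mathrm{VINT} - \underset{(1.5)}{.098}\,\mathrm{REGT}, \tag{15}$$

$$R^2 = .898, \quad SE = .046, \quad D\text{-}W = 1.53.$$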

The results in equation (15) all have the correct signs and, with the
exception of REGT, are statistically significant. The magnitude of the
coefficients seems reasonable on the basis of a computation presented
in table 3 that utilizes equation (15) to forecast changes in the technical
index for the periods 1951-60, 1961-70, and 1971-78. Table 3
asks how much of the change in the technical change index over each
subperiod can be explained by capacity utilization, vintage effects,
environmental controls, and regulatory tightness.
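One way to read a decomposition of this kind is that each regressor's contribution is its estimated coefficient times the change in that regressor over the subperiod. The sketch below uses the coefficients reported in equation (15); the changes in the regressors are hypothetical placeholders, and the exact scaling the authors used to express table 3 in percentage terms is not stated in the text, so this is an assumption-laden illustration rather than a replication.

```python
# Slope coefficients as reported in equation (15).
COEFFICIENTS = {"CU": 0.665, "SOKWH": 29.90, "VINT": 0.0188, "REGT": -0.098}

def contributions(changes):
    """Contribution of each regressor: coefficient times its change."""
    return {name: COEFFICIENTS[name] * changes[name] for name in COEFFICIENTS}

# Hypothetical changes in the regressors over one subperiod (assumed values).
example = contributions({"CU": -0.01, "SOKWH": 0.0, "VINT": 1.0, "REGT": -0.05})
predicted_change = sum(example.values())
```

Because the regression is linear, the contributions sum to the predicted change in the index, which is what allows a subperiod change to be apportioned among the four determinants.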

Table 3 offers some very insightful results. First, the biggest factor
leading to the decline in annual technical change appears to be sulphur
oxide restrictions put in place in the 1970s. This finding echoes

17 For examples in which capacity utilization was entered directly, see Stevenson 
(1980) and Cowing, Small, and Stevenson (1981). 

18 For our purposes, we used sulphur oxide emissions measured in pounds per kilowatt hour.
The sulphur oxide emissions are available from the Environmental Protection Agency. Prior to
1970, sulphur oxide restrictions were assumed to be nonbinding, so sulphur oxide per kilowatt
hour was held constant at the estimated 1970 rate. For a detailed construction of a superior
environmental measure, see Gollop and Roberts (1983).

19 Nelson has kindly provided the average vintage index (VINT) based on an arithmetic
average of the year of installation for 44 major electric utilities.
TABLE 3

Explanation of Average Annual Technical Change

                                       Contribution of
           Percentage    ------------------------------------------------------
           Growth in                                Sulphur
           Technical     Capacity      Vintage      Oxide          Regulatory
           Change        Utilization   Effects      Restrictions   Tightness

1951-60      2.93          -.57          2.19          0              .74
1961-70      2.04           .44          1.44          0              .57
1971-78     -1.03          -.83          1.62         -2.61           .10

the earlier results of Gollop and Roberts (1983). Second, our results
support Nelson’s contention that technical change is very much tied to
vintage capital effects. During the 1950s, the stock of generation capacity
grew rapidly, and this was the major driver of the high rates of
technical change in the 1950s. Since then, the capital stock has expanded
less rapidly, contributing to a reduction in technical change.
Third, the effects of regulatory tightness are much smaller in relative
magnitude and only significant under a one-tailed test at the 10 percent
significance level.²⁰ Nevertheless, this measure indicates a rather
plausible steady tightening of regulation constraints in the 1950s and
1960s with corresponding cost reductions. After 1970, this measure
depicts an already tightly constrained regulatory framework with little
role for additional cost reductions. Fourth, capacity utilization fluctuates
much more than one would expect from variations in
the business cycle. There has been a secular decline in capacity utilization
from 57.0 percent in 1951 to 45.8 percent in 1978, owing to the
tendency for peak demand to grow relative to base-load demand, due
primarily to air conditioning. With lower utilization of the capital
stock, it is not surprising that productivity would decline. Interestingly,
the 1960s represented an aberration in this secular trend:
capacity utilization increased as rapid demand growth in the late
1960s combined with slower capacity expansion. Consequently, the
index of technical change did not fall much in the 1960s vis-a-vis the
1950s owing largely to an artificially increasing rate of capacity utilization.
When capacity utilization resumed its downward trend in the
1970s, this, together with sulphur oxide restrictions, tended to produce
an abrupt drop in technical change. Thus it would appear that


20 The regression omitting regulatory tightness yields similar results. For example,
the percentage contributions in table 3 become, respectively, -0.67, 2.98, 0 (1951-60);
0.51, 1.96, 0 (1961-70); and -0.97, 2.20, and -2.87 (1971-78). Note that the regulatory
tightness variable appears absorbed primarily in the vintage capital effects.
the productivity puzzle of the 1970s in electric utilities has some
commonsense explanations, based in part on sulphur oxide restrictions
and rising peak- versus base-load demand leading to lower capacity
utilization.

VI. Summary and Conclusions 

Applied researchers have been faced with an awkward choice: (1)
utilize exact indexes to compute a general index of technical change,
predicated on restrictive assumptions regarding the underlying technology;
or (2) utilize a time trend representation of technical change
in a more general model of production technology. This paper offers
a method for simultaneously allowing for a general index of disembodied
technical change within the context of a quite general production
technology. Our technique requires a pooled data set for firms
within the same industry and utilizes dummy variables to estimate
econometrically a purely general index of technical change, A(t). As
pooled data sets for firms in the same industry become increasingly
available, our procedure can find increased applications.²¹
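The role of the pooled data set can be illustrated with a deliberately stripped-down case. Assume, hypothetically, a balanced panel in which log cost depends only on a firm effect and a year effect. Least squares with firm and time dummies then reduces to differences in year means, with the base year normalized to zero. The paper's actual model also contains translog price and output terms and interactions with A(t), none of which appear in this sketch; the function `time_effects` and the numbers are illustrative assumptions.

```python
def time_effects(panel):
    """Year effects relative to the first year in a balanced firm-by-year panel.

    panel[i][t] holds the outcome for firm i in year t. With only firm and
    year dummies, the least squares year effects equal differences in year means.
    """
    n_firms = len(panel)
    n_years = len(panel[0])
    if any(len(firm) != n_years for firm in panel):
        raise ValueError("panel must be balanced")
    year_means = [sum(firm[t] for firm in panel) / n_firms for t in range(n_years)]
    return [mean - year_means[0] for mean in year_means]

# Two hypothetical firms with different cost levels but a common time pattern.
effects = time_effects([[1.0, 0.9, 0.7], [2.0, 1.9, 1.7]])
```

The point of the sketch is that the common time pattern is identified only because several firms are observed in every year; a single time series could not separate it from firm-specific variation.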

Our example for the electric utility industry for the period 1951-78
reveals some interesting findings. First, compared with the simple
Tornqvist approximation to the Divisia index, the general technical
index model produces a close approximation to the pattern of observed
changes in total factor productivity (table 1). Furthermore, the
general technical index model enables separation of scale effects from
technical effects and the decomposition of the latter into pure, nonneutral,
and scale-augmenting technical change. Second, the major difference
between the general index model and the standard time trend model
is the measure of pure technical change, which for the standard
time trend model is $\delta + \delta^* T$ as contrasted with $A(t) - A(t - 1)$
for the general index model. The latter index contains an economically
rich pattern of variation useful in subsequent analyses of
the determinants of technical change, whereas the standard time
trend index tells us very little about technical change and forms little
basis for further analyses. A simple regression of the general technical
index on industrywide measures of capacity utilization, environmental
restrictions, vintage capital effects, and regulatory tightness
suggests that the productivity decline of the 1970s can be attributed
primarily to sulphur oxide restrictions and secularly declining capacity
utilization due to rapidly increasing peak-load demands.
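The contrast between the two measures of pure technical change can be made concrete with the estimates printed in appendix table A1. The sketch below assumes, as the table appears to indicate, that the time trend coefficients are -.045 and .002 and that A(19) through A(24) take the values shown; it also assumes that negative values denote downward shifts of the cost function. The function names are hypothetical.

```python
# Time trend coefficients as read from appendix table A1 (assumed reading).
DELTA, DELTA_STAR = -0.045, 0.002

# A(t) for t = 19..24 as read from appendix table A1 (assumed reading).
A = {19: -0.465, 20: -0.468, 21: -0.429, 22: -0.426, 23: -0.459, 24: -0.429}

def trend_shift(T):
    """Pure technical change in the standard model: delta + delta* T."""
    return DELTA + DELTA_STAR * T

def index_shift(t):
    """Pure technical change in the general index model: A(t) - A(t - 1)."""
    return A[t] - A[t - 1]

smooth = [trend_shift(T) for T in range(20, 25)]
jagged = [index_shift(t) for t in range(20, 25)]
```

The trend measure moves by the same .002 every year, while the first differences of A(t) change sign from year to year, which is the variation the subsequent regression analysis exploits.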


21 It is particularly encouraging that the Bureau of the Census is developing a Longitudinal
Establishment Data File, covering 350,000 establishments since 1972.
TABLE A1

Parameter Estimates of Standard Time Trend and General
Technical Index Models

            Standard Time Trend Model           General Technical Index Model
            --------------------------------    --------------------------------
            Estimate   Standard  t-Statistic    Estimate   Standard  t-Statistic
                       Error                               Error

α_0           .318      .037        8.7           .283      .033        8.7
α_K           .297      .005       57.9           .273      .006       43.3
α_L           .142      .003       40.8           .147      .004       34.9
γ             .930      .022       41.9           .964      .019       49.5
γ*            .038      .0116       3.8           .050      .010        4.9
β_KK          .112      .005       21?.1          .102      .006       17.9
β_KL         -.016      .004        4.1          -.004      .004        1.0
β_LL          .058      .004       14.0           .0150     .004       15.5
φ_K          -.003      .0003      11.4           .094      .017        5.5
φ_L          -.002      .0002      12.8           .144      .013       10.8
ψ_K           .013      .002        6.4          -.018      .002        8.6
ψ_L          -.004      .001        3.4          -.005      .001        3.7
θ            -.001      .0008       1.4           .034      .039         .9
δ            -.045      .002       18.9           ...
δ*            .002      .0001      17.5           ...

Dummy Variable Coefficients

D2           -.190      .027       -7.1          -.155      .024       -6.6
D3           -.039      .024       -1.6          -.048      .021       -2.3
D4           -.189      .022       -8.5          -.168      .019       -8.6
D5           -.057      .032       -1.8          -.109      .027       -4.0
D6            .139      .036        3.9           .070      .030        2.3
D7            .003      .023         .1           .020      .020        1.0
D8            .007      .030         .2          -.035      .026       -1.4
D9           -.286      .022      -13.2          -.268      .019      -14.3
D10          -.169      .028       -6.0          -.232      .024       -9.5
D11          -.077      .029       -2.6          -.146      .025       -5.8
D12          -.083      .025       -3.4          -.114      .021       -5.3
D13          -.231      .025       -9.1          -.271      .022      -12.3
D14          -.103      .029       -3.5          -.158      .025       -6.3
D15          -.217      .022       -9.7          -.218      .020      -11.2
D16          -.166      .026       -6.5          -.191      .022       -8.6
D17          -.034      .035       -1.0          -.110      .030       -3.7
D18          -.134      .027       -4.9          -.184      .024       -7.8
D19          -.085      .025       -3.4          -.122      .022       -5.5
D20          -.165      .027       -6.1          -.213      .023       -9.1
D21          -.136      .022       -6.2          -.128      .019       -6.8
D22          -.262      .022      -12.0          -.260      .019      -13.7
D23          -.320      .029      -10.9          -.372      .025      -14.7
D24          -.129      .024       -5.3          -.114      .021       -5.4
D25          -.260      .022      -11.8          -.267      .019      -14.0
D26          -.014      .023        -.6          -.020      .020       -1.0
D27          -.045      .023       -2.0          -.059
D28          -.183      .022       -8.4          -.181
D29          -.032      .033       -1.0          -.098
D30          -.147      .022       -6.6          -.152

A(2)          ...                                -.035                  1.8
TABLE A1 (Continued)

            Standard Time Trend Model           General Technical Index Model
            --------------------------------    --------------------------------
            Estimate   Standard  t-Statistic    Estimate   Standard  t-Statistic
                       Error                               Error

A(3)          ...                                -.067      .020        3.3
A(4)          ...                                -.067      .020        3.3
A(5)          ...                                -.117      .020        5.6
A(6)          ...                                -.137      .021        6.4
A(7)          ...                                -.184      .022        8.3
A(8)          ...                                -.198      .022        8.9
A(9)          ...                                -.254      .023       10.9
A(10)         ...                                -.264      .024       11.1
A(11)         ...                                -.267      .024       11.2
A(12)         ...                                -.267      .024       11.1
A(13)         ...                                -.298      .025       12.1
A(14)         ...                                -.311      .025       12.4
A(15)         ...                                -.320      .025       12.6
A(16)         ...                                -.368      .026       14.0
A(17)         ...                                -.388      .026       14.7
A(18)         ...                                -.420      .027       15.5
A(19)         ...                                -.465      .027       16.9
A(20)         ...                                -.468      .028       16.9
A(21)         ...                                -.429      .028       15.4
A(22)         ...                                -.426      .028       15.1
A(23)         ...                                -.459      .029       16.0
A(24)         ...                                -.429      .029       14.8
A(25)         ...                                -.244      .029        8.5
A(26)         ...                                -.405      .030       13.7
A(27)         ...                                -.398      .030       13.2
A(28)         ...                                -.386      .030       12.9
Employment Contracts, Influence Activities,
and Efficient Organization Design 


Paul R. Milgrom 

Stanford University 


When changing jobs is costly, efficient employment contracts usually
fail to compensate workers for the effects of posthiring events and
decisions. Then, when there are executives and managers with authority
to make discretionary decisions, affected employees will be
led to waste valuable time trying to influence their decisions. Efficient
organization design counters this tendency by limiting the discretion
of decision makers, especially for those decisions that have
large distributional consequences but that are otherwise of little
consequence to the organization.


The inference to which we are brought is that the causes
of faction cannot be removed, and that relief is only to
be sought in the means of controlling its effects. [James
Madison, The Federalist]


I. Introduction 

Experience suggests—and most Western economists believe—that 
decentralized economic authority such as that found in market economies
encourages innovation and promotes efficient resource use. The

I owe a special debt of thanks to John Roberts for allowing me to incorporate the
fruits of our joint research into this paper, for suggesting some of the reported applications
and references, and for his thorough and constructive commentary on the previous
versions. For their helpful comments, I also thank Drew Fudenberg, Bengt Holmstrom,
Roger Kormendi, Ed Lazear, Maggie Levenstein, Rick Levin, Meg Myer, Jim
Mirrlees, Charlie Plott, Ariel Rubinstein, and Jean Tirole, my research assistants
Byung-Il Choi and Bernard Desgagne, and two anonymous referees. I gratefully acknowledge
the financial support of the National Science Foundation, the John Simon
Guggenheim Foundation, the Sloan Foundation, and the University of California,
Berkeley.

[Journal of Political Economy, 1988, vol. 96, no. 1]

© 1988 by The University of Chicago. All rights reserved. 0022-3808/88/9601-0007$01.50



reasons for these advantages, however, have proved difficult to pinpoint.
Why can’t a centrally planned, socialist economy mimic a decentralized
one whenever that is desirable? Coase (1937) posed the
corresponding question for private ownership economies: “Why is
not all production carried out by one big firm?"

I shall argue that there are costs, called “influence costs,” that attend
any increase in centralized control, whether in a firm or in a
larger economic system. These costs arise because participants inevitably
care about the decisions that the central authority can make and
so tend to spend too much time trying to influence the authority’s
decisions. That time, of course, is valuable; if it were not wasted on
influence activities, it could be used for directly productive activities
or simply consumed as leisure.

The fact that centralization entails costs does not mean that centralizing
decision authority is never desirable. Central planning and
decision making may improve coordination among the diverse actors
in an economic system enough to make bearing the attendant influence
costs worthwhile. However, when the potential benefits of
central control are slight and the influence costs are great, the discretion
of the central authority should be restricted. Since influence costs
tend to be greater when the members of the organization have larger
stakes in the decision to be made, efficiently designed organizations
limit the discretion of decision makers in those matters that are of
little direct importance to the organization (in terms of the potential
for improved decision making to advance the organization’s objectives)
but of great importance to individual organization members.

Although the foregoing themes appear to be general ones, I shall
limit the formal analysis of them to the important special case in
which the organization is a profit-maximizing firm and the interested
parties are the firm’s employees. This focus forces one to face certain
issues squarely. First, why do employees care about the decisions
made by their employers? Under the traditional spot contracting
equilibrium theory of the labor market, prevailing wages always leave
each employee just indifferent between his current (best) job and his
next-best job alternative. According to that theory, jobs that are unpleasant
or dangerous pay higher wages than those with more desirable
characteristics. In practice, employers do pay some compensating
differentials: Premium wages have often been paid for hazardous
duty, overseas assignments, and late-night shifts. Why aren't these
practices even more extensive, fully compensating all employees for
all variations in job characteristics? Are the uncompensated job characteristics
found in practice just an unimportant residual? These
questions are of central importance for the theory, for if wages did
fully compensate employees for all variations in job characteristics,
then employees would have no interest in influencing employer decisions.

I have no evidence to offer concerning the magnitudes of the failure
of compensating differential theories, though it is clear from
casual observation that for salaried workers pay is normally adjusted
only for substantial and long-lived changes in job attributes. The cost
of writing detailed contracts provides one partial explanation for this
incompleteness of compensating differentials.

In Section II, I offer some alternative explanations. The first is
based on an optimal contracting model in which the wage paid can
depend on all the attributes of a worker’s assignment; the assignment
itself is assumed to be determined only after the worker is hired. I
assume that there are some restrictions on worker mobility, such as
relocation or training costs, that free the employer from the absolute
need to compensate employees fully for every variation in their work
environment. Still, under the terms of an optimal contract, risk-neutral
employers always insure risk-averse employees against income
fluctuations, and one might guess that employers would also
insure employees against other sources of fluctuations in their welfare.
Such a guess would be far off the mark. For example, with any
Cobb-Douglas specification of ordinal preferences over working conditions
and wages, an optimal contract can specify that higher wages be
paid to employees enjoying better working conditions! More generally,
when employees care about both working conditions and consumption
and provided that consumption is a normal good, employees will
prefer assignments with good working conditions because under an
optimal contract poorer working conditions are not fully compensated
by higher wages. The magnitudes of these effects depend on
employee risk aversion: As risk aversion increases, the optimal wage
schedule is transformed toward one with fully compensating differentials.

Two additional contracting models are also analyzed in Section II.
In these models, unlike the one just discussed, job characteristics matter
to employees only to the extent that they affect income. In each
model, I compute the optimal contract and then study the income
streams attached to different assignments. In the first, employees are
found to prefer assignments that build their human capital because
these raise future wages with no offsetting current wage reduction. In the
second, employees are found to prefer “critical” jobs (defined as
those for which quits are especially costly to the employer) because
these jobs pay higher wages. In all three models, employees care
about events that occur after the date of hiring. And, in all three, an
employee’s ranking of these events bears no necessary relation to the
ranking based on employer net profits.
Influence activities and the optimal limitation of executive and
management discretion are analyzed in Section III by means of a
model in which employees allocate their time between influence activities
and some directly productive activity. Although the firm can use
its compensation policy to alleviate influence costs, it will sometimes
prefer to restrict the decision maker’s discretion instead. There are
two key parameters in the model that are used to characterize when
the discretion of management should be restricted. The first parameter
measures the importance of the decision to the organization; it is
essentially the excess of the expected payoff from making an informed
decision over that from holding unconditionally to the status
quo. The second measures the redistributive potential of a change; it
is the utility that would be transferred from one employee to another
if a change from the status quo were authorized without any compensating
wage adjustment. In an efficiently designed organization, management
will be allowed no discretion over those decisions that are of
little importance to the organization but that have potentially large
redistributive consequences.

As an illustration of efficient design, consider American Airlines'
procedure for assigning flight attendants to routes. Once a month,
flight attendants bid for the routes they prefer, with conflicts resolved
on the basis of seniority.¹ Management exercises no discretion over
the assignment decision. This is perfectly appropriate: The airline
cares little about which attendants are assigned to which routes, but
the flight attendants care a great deal. American Airlines’ practice,
like many standard operating procedures, can be understood as an
attempt to avoid the influence activities that would result if management
exercised discretion in assigning flight attendants to routes.

Rosenberg and Birdzell (1986) have emphasized the historical importance
for Western economic growth of the “immunity [of innovators]
from interference by the formidable social forces opposed
to change, growth, and innovation” (p. 24). In terms of the theory
presented here, the social costs of an incorrect decision to allow
experimentation with, say, a new steel-making process or a new sailing
ship design were small compared with the potential redistributive
consequences of a successful innovation. Thus it was wise or lucky
that Western governments established no agencies with authority to
review and reject proposed innovations. In contrast to the continuous
commercial and industrial development in the West since the Middle
Ages, Chinese development under the tight control of its powerful

1 For international flights, some positions are reserved for suitably multilingual flight
attendants, and only those with certified fluency are permitted to bid for those positions.
scholar-bureaucrats was slow, despite the advanced state of China’s
science and the capital accumulations of its merchants.²

A brief review of some related theoretical literature is given in 
Section IV, applications are suggested in Section V, and concluding 
remarks are offered in Section VI. 


II. Why Full Compensating Differentials 
Are Not Paid 

Following Coase (1937) and Simon (1951), let us suppose that at the
time of contracting neither the employer nor the employee knows
precisely what conditions will prevail at the time that work must actually
be performed. In an academic job market, a new professor may
not know who his colleagues will be, which courses he will teach, what
his committee and administrative responsibilities will be, which office
and secretary will be assigned to him, who his research assistant will
be, and so forth. These characteristics of the job, to be determined
after the employment relation begins, will be denoted by $x$. The employment
contract specifies a wage that may be a function of the
undetermined characteristics: $w = w(x)$.

To build a simple formal model of this situation, assume that the
possible circumstances $\{x_1, \ldots, x_N\}$ and their probabilities
$\{p_1, \ldots, p_N\}$ are given exogenously. Let $w_i$ denote the wage paid in
circumstances $x_i$. Suppose that the employee’s preferences are given by the von
Neumann-Morgenstern utility function $u(x, w)$. For brevity, let us
write $u_i(w)$ for $u(x_i, w)$. Assume that each $u_i$ is twice continuously
differentiable with $u_i' > 0$. The employer is a risk-neutral expected
net profit maximizer; it receives revenues of $\pi_i$ in event $x_i$. Suppose
that, at the time of contracting, labor market conditions require the
employer to offer the agent an expected utility of at least $\bar{u}$. Further
suppose that the employee, after signing the contract and learning
that the job is $x$, will quit and reenter the labor market unless $u(x, w(x))$
is at least some reservation level $\hat{u} < \bar{u}$, where $\bar{u} - \hat{u}$ reflects mobility
costs. The employer, however, is assumed always to be bound by the
contract. An efficient contract, subject to the employee’s “no quitting”
constraint, solves


,v 

maximize X pA*i - w,) (t-P) 

i 


2 Needham (1969, p. 197) holds that “capital accumulations in Chinese society there 
could indeed be, but the application of it in permanently productive industrial enter¬ 
prises was constantly inhibited by the scholar-bureaucrats, as indeed was any other 
social action which might threaten their supremacy." 





subject to

$\sum_{i} p_i u_i(w_i) \ge \bar{u}$,

$u_j(w_j) \ge \hat{u}$ for all $j = 1, \ldots, N$.

Let us consider a family of problems like (CP), parameterized by $\bar{u}$.
Take $\hat{u}$ to be any function of $\bar{u}$ such that $\hat{u}(\bar{u})$ is always less than $\bar{u}$.
When does the optimal contract pay full compensating differentials, 
leaving the employee indifferent among posthiring events? 

Theorem 1. A solution to (CP) exists and makes the employee
indifferent among outcomes and just willing to work ($u_i(w_i^*) = \bar{u}$)
for every $\bar{u}$ in the range of $u_1$ if and only if $u_1$ is concave
and for all $i$ there exists $g_i$ such that

$u_i(w) = u_1(w + g_i)$ for all $w$. (1)

Proof. It is routine to check that (1) and the concavity of $u_1$ imply
that the optimal contract exists and satisfies $u_i(w_i^*) = \bar{u}$;
attention is focused on the reverse implication. Regarding $w_i^*$ as a
function of $\bar{u}$, the hypothesis is that $u_i(w_i^*(\bar{u})) = \bar{u}$
for all $i$ and all $\bar{u}$ in the range of $u_1$, that is,
$w_i^* = u_i^{-1}$. The first-order necessary conditions for optimality
in (CP) imply that, for all $i$ and all $\bar{u}$ in the range of $u_1$,

$u_i'(w_i^*(\bar{u})) = u_1'(w_1^*(\bar{u}))$. (2)

Since differentiating $u_i(w_i^*(\bar{u})) = \bar{u}$ gives
$w_i^{*\prime}(\bar{u}) = 1/u_i'(w_i^*(\bar{u}))$, it follows that
$w_i^{*\prime}(\bar{u}) = w_1^{*\prime}(\bar{u})$. Hence,
$w_1^*(\bar{u}) = w_i^*(\bar{u}) + g_i$ for all $\bar{u}$ in the range
of $u_1$, where the $g_i$’s are constants of integration. Then, for any
fixed $w$, $u_1(w + g_i) = u_1[w_i^*(u_i(w)) + g_i] = u_1[w_1^*(u_i(w))] = u_i(w)$.
This holds for all $w$, as required.

Given the identity just derived, the second-order necessary conditions
imply that $u_1''(w_1^*(\bar{u})) \le 0$ for all $\bar{u}$, which
establishes concavity. Q.E.D.

An optimal contract equates a risk-averse employee’s marginal util¬ 
ity of income in the different events x; it does not also equate his 
utilities in the different events unless the employee has ordinal pref¬ 
erences that can be represented by vertically parallel indifference 
curves in (x, w)-space. This characterization of ordinal preferences is 
quite restrictive. When it fails, the optimal contract will not leave the 
employee indifferent among assignments. 
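
The sufficiency half of theorem 1 can be illustrated numerically. The sketch below assumes, purely for illustration, that the base utility is logarithmic and that utility in event i is the base utility evaluated at the wage plus a shift g_i; the probabilities, shifts, and required utility level are hypothetical values, not taken from the paper. Under these assumptions the wages that deliver the required utility in every event also equalize marginal utility across events, which is the first-order condition for an optimal contract.

```python
import math

# Hypothetical numbers, chosen only for illustration.
p = [0.2, 0.5, 0.3]      # event probabilities
g = [0.0, 1.5, -0.5]     # shifts g_i (g_1 = 0 for the reference event)
u_bar = 1.0              # expected utility the employer must offer


def u(i, w):
    """Utility in event i: the base utility log(.) evaluated at w + g_i."""
    return math.log(w + g[i])


# Wages that give utility u_bar in every event: log(w_i + g_i) = u_bar.
w_star = [math.exp(u_bar) - g_i for g_i in g]

# Marginal utility of income in each event: d/dw log(w + g_i) = 1 / (w + g_i).
marginal = [1.0 / (w + g_i) for w, g_i in zip(w_star, g)]

utilities = [u(i, w) for i, w in enumerate(w_star)]
expected_utility = sum(pi * ui for pi, ui in zip(p, utilities))
```

Because every `w_star[i] + g[i]` equals the same number, both the utility levels and the marginal utilities coincide across events, so full compensating differentials and efficient risk sharing are compatible in this special case.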

Now let us make an obvious but quite important observation: At the
optimal contract, the employee’s wages $w_i^*$ do not depend at all on the
gross profit levels $\pi_i$. Consequently, there is no necessary relationship
between the employer’s ranking of outcomes and the employee’s. Later, when
the possibility that the employee can influence the distribution of $x$ is
introduced, this divergence of rankings will become quite important.




Theorem 1 is just a starting point. It tells us that full compensating
differentials are rarely paid in a large class of contracting models. The
remainder of this section is devoted to the development of examples
to illustrate the following points: (1) There is not even a general
tendency for optimal wage schedules to compensate for job characteristics,
so that employee job concerns under optimal contracts may be
quite pronounced; (2) increases in risk aversion tend to lead to the
payment of fuller compensating differentials; (3) employees may care
about job attributes under optimal contracts even when, contrary to
the simple model just presented, job attributes are not an argument of
employee utility functions; and (4) these models lead to plausible
predictions about the kinds of preferences among job characteristics
that employees may systematically show.


Example 1: Preference for Good Working Conditions 

Let $x > 0$ denote either working conditions or on-the-job
consumption$^3$ and let $w \ge 0$ denote the wage or at-home consumption.
Suppose that the employee’s ordinal preferences have the Cobb-Douglas
form $x^{\alpha}w$ and that his coefficient of relative risk aversion for wage
gambles is the constant $\rho > 0$, so that the employee is risk averse.
These cardinal preferences are represented by
$U(x, w) = \alpha \ln(x) + \ln(w)$ in case $\rho = 1$ and by
$U(x, w) = (x^{\alpha}w)^{1-\rho}/(1 - \rho)$ in case $\rho \ne 1$.
Suppose that the employee’s initial reservation utility level is
$\bar{u}^{1-\rho}/(1 - \rho)$, with $\bar{u} > 0$ and $\hat{u} = -\infty$.
Then the solution to the contracting problem (CP) is
$w(x) = \lambda x^{\alpha(1-\rho)/\rho}$ for some constant $\lambda$ that
depends on the parameters $\alpha$, $\rho$, $\bar{u}$, and $(p_1, \ldots, p_N)$.
Notice in particular that if $\rho < 1$, then $w(\cdot)$ is actually an
increasing function of $x$: This establishes that there is no general
tendency for optimal contracts to pay even partial compensation for
unfavorable working conditions.

In this example, the ordinal utility associated with job $x$ can be
measured by $x^{\alpha}w(x) = \lambda x^{\alpha/\rho}$; it increases in $x$
for any level of risk aversion. This last observation is a special case
of a general result that has been derived by several authors including
Chari (1983), Green and Kahn (1983), and Bergstrom (1986). Their result,
applied to this model, holds that if on-the-job consumption is a normal
good, then the optimal wage contract will always lead employees to prefer
jobs with higher $x$.
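
Example 1 can be checked with a short computation. The sketch below is an illustration under stated assumptions: it treats only the case in which relative risk aversion differs from one, takes the participation constraint to bind, and pins down the constant in the wage schedule from that constraint. The function names and all numerical values are hypothetical.

```python
def optimal_wages(xs, ps, alpha, rho, u_bar):
    """Wage schedule w(x) = lam * x**(alpha*(1-rho)/rho), for rho != 1.

    lam is chosen so that expected utility equals the reservation level
    u_bar**(1-rho)/(1-rho), i.e. the participation constraint binds.
    """
    if rho == 1:
        raise ValueError("this sketch covers rho != 1 only")
    expo = alpha * (1.0 - rho) / rho
    # (x**alpha * w)**(1-rho) = lam**(1-rho) * x**expo, so the binding
    # constraint reads lam**(1-rho) * sum_i p_i x_i**expo = u_bar**(1-rho).
    s = sum(p * x ** expo for p, x in zip(ps, xs))
    lam = u_bar / s ** (1.0 / (1.0 - rho))
    return [lam * x ** expo for x in xs]


def expected_utility(xs, ps, ws, alpha, rho):
    """Expected value of (x**alpha * w)**(1-rho) / (1-rho)."""
    return sum(p * (x ** alpha * w) ** (1.0 - rho) / (1.0 - rho)
               for p, x, w in zip(ps, xs, ws))


def marginal_utilities(xs, ws, alpha, rho):
    """U_w(x, w) = x**(alpha*(1-rho)) * w**(-rho) in each event."""
    return [x ** (alpha * (1.0 - rho)) * w ** (-rho) for x, w in zip(xs, ws)]


# Hypothetical job characteristics and probabilities.
xs = [1.0, 2.0, 4.0]
ps = [0.3, 0.4, 0.3]
w_mild = optimal_wages(xs, ps, alpha=0.5, rho=0.5, u_bar=1.0)    # rho < 1
w_strong = optimal_wages(xs, ps, alpha=0.5, rho=2.0, u_bar=1.0)  # rho > 1
```

With these numbers the schedule `w_mild` rises with x while `w_strong` falls with x, yet in both cases the ordinal index x**alpha * w is higher in better jobs, in line with the discussion above.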

In example 1, as the coefficient of relative risk aversion $\rho$ increases,
the ordinal utility measure $x^{\alpha}w(x) = \lambda x^{\alpha/\rho}$ becomes
increasingly flat and converges to the constant $\bar{u}$. With increases
in risk aversion, the optimal contract pays higher wages in bad jobs and
lower wages in good jobs until, in the limit as relative risk aversion
tends to infinity, full compensating differentials are paid. The
proposition proved below generalizes the example and establishes that,
for fixed ordinal preferences represented by a smooth utility function
$U(x, w)$ that is concave in $w$, if increases in risk aversion cause the
wages to rise in some jobs and to fall in others, then the wages rise in
“poor jobs” and fall in “good jobs.”

3 Stafford and Cohen (1974) supplied one of the earliest economic
treatments of on-the-job consumption in a study of how work effort varies
during the workday.

Let $U(x, w)$ represent the preferences of the less-risk-averse employee
and $V(U(x, w))$ the preferences of the more-risk-averse employee.
Assume that $U_w > 0$, $U_{ww} < 0$, $V' > 0$, and $V'' < 0$. In the finite
state model, we may assume without loss of generality that
$V'(u) \to +\infty$ as $u \to -\infty$ and $V'(u) \to 0$ as $u \to +\infty$.
The reservation utility levels for the two problems are $\bar{u}$ and
$\bar{v}$; no assumption is made about how they are related. When an
interior optimum to the two optimal contracting problems is assumed, the
marginal utilities of income across assignments are equalized for each
of the two agents: $U_w(x, w(x)) = \lambda$ and
$V'(U(x, \hat{w}(x)))U_w(x, \hat{w}(x)) = \mu$ for all $x$, where
$w(\cdot)$ and $\hat{w}(\cdot)$ are the respective optimal wage schedules.

Theorem 2. There exists $u^*$ such that, for all $x$,
$u^* \le U(x, \hat{w}(x))$ if and only if $\hat{w}(x) \le w(x)$. That is,
as the employee grows more risk averse, the ordinal utility levels
associated with each assignment are contracted toward a level $u^*$ by
raising wages in assignments with lower utility and reducing wages in
assignments with higher utility.

Proof. Fix $u^*$ so that $V'(u^*) = \mu/\lambda$. Then (since $U_w$ is
positive and $V'$ is decreasing) $u^* \le U(x, \hat{w}(x))$ if and only if
$\mu = V'(U(x, \hat{w}(x)))U_w(x, \hat{w}(x)) \le (\mu/\lambda)U_w(x, \hat{w}(x))$,
which holds if and only if $U_w(x, \hat{w}(x)) \ge \lambda = U_w(x, w(x))$.
Since $U_{ww} < 0$, this holds if and only if $w(x) \ge \hat{w}(x)$. Q.E.D.

Under the additional assumptions that $\bar{v} = V(\bar{u})$ and that
$-V''(\cdot)/V'(\cdot)$ is bounded below by a constant $r$, it can be shown that
$U(x, \hat{w}(x)) \to \bar{u}$ as $r \to \infty$; that is, as the lower bound
on the coefficient of absolute risk aversion tends to infinity, wages tend
to compensate fully for variations in working conditions.


Example 2: Preference to Accumulate Human Capital$^4$

This example is a variation on example 1 in which the relevant attri¬ 
bute of the job, contribution to general human capital, is not a direct 
argument of the worker’s utility function. 

Suppose that the employee has a two-period life. His productivity
in period 1 is $p$; in period 2 either it is $p$ again or, if he has increased
his human capital in the first period, it is $q > p$. There is no
firm-specific human capital, so the worker’s productivity does not depend
on whether he remains with his initial employer. There are two possible
events in the first period. In the first event, which arises with
probability $r$, the employer will assign the worker to a task that
increases his second-period productivity to an amount $q > p$. In the
second, which occurs with probability $1 - r$, the worker is assigned to
a task in which human capital is unchanged, so the worker’s second-period
marginal product will be $p$.

4 This model and its analysis are adapted from Harris and Holmstrom (1982).

At the beginning of the second period, the worker is free to quit the
firm and go to work elsewhere for a wage equal to his current mar¬ 
ginal product. This mobility imposes a lower bound on the wage the 
worker can be paid in the second period. However, there are some 
market frictions: The employee cannot leave during the first period 
after learning his job assignment. 

Let $\pi_j$ be an increment to the firm’s revenues when event $j$ occurs
and $w_{ij}$ the corresponding period $i$ wage. Assume that competition
among similar firms drives the expected wage over the two-period 
contract to be equal to the worker’s expected marginal product over 
that period, which is rq + (2 - r)p. The model also assumes that the 
worker can neither borrow nor save (although only the no-borrowing 
constraint is in fact binding) so that his consumption is equal to his 
income in each period. Competition among employers will lead them 
to offer an efficient contract, one that maximizes the worker's utility 
subject to the maximum expected wage constraint and the constraints 
on second-period wages: 

maximize $r[u(w_{11}) + u(w_{21})] + (1 - r)[u(w_{12}) + u(w_{22})]$ (3)

subject to

$r(w_{11} + w_{21}) + (1 - r)(w_{12} + w_{22}) = rq + (2 - r)p$, $w_{21} \ge q$, $w_{22} \ge p$,

where $u$ is some strictly concave function.

This is a concave maximization problem with linear constraints, so
its optimal solution is fully characterized by a first-order condition. It
is not hard to verify that the unique optimal solution has $w_{21} = q$ and
$w_{11} = w_{12} = w_{22} = p$ with Lagrange multipliers of $u'(p)$,
$r[u'(p) - u'(q)]$, and zero, respectively, on the three constraints. Thus
in each period the employee is paid his current marginal product. An
employee who is fortunate enough to be assigned to job 1 acquires valuable
human capital but suffers no offsetting wage reduction under
the terms of the optimal contract. Consequently, employees prefer
job assignments that increase their human capital. The employer’s net
profit under the contract in event $j$ is precisely $\pi_j$, an amount
unrelated to the employee’s human capital acquisition. So the employee’s
interests may conflict with the employer’s.
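
The claimed solution of example 2 can be verified by checking the Kuhn-Tucker conditions directly. The sketch below assumes logarithmic utility as one strictly concave choice and uses hypothetical values for the productivities and the assignment probability; it evaluates the budget constraint, the stationarity conditions with the multipliers reported in the text, and the effect of one budget-neutral deviation.

```python
import math

# Hypothetical parameter values; u = log is one strictly concave choice.
p, q, r = 1.0, 1.5, 0.4
u = math.log


def du(w):
    """Derivative of the assumed utility function u = log."""
    return 1.0 / w


# Candidate solution: period-2 wage q after the human-capital task,
# and wage p everywhere else.
w11, w21, w12, w22 = p, q, p, p

budget = r * (w11 + w21) + (1 - r) * (w12 + w22)
target = r * q + (2 - r) * p

# Multipliers on the expected-wage constraint, on w21 >= q, and on w22 >= p.
lam = du(p)
mu1 = r * (du(p) - du(q))
mu2 = 0.0

# Stationarity of the Lagrangian with respect to w11, w21, w12, w22.
foc = [
    r * du(w11) - lam * r,
    r * du(w21) - lam * r + mu1,
    (1 - r) * du(w12) - lam * (1 - r),
    (1 - r) * du(w22) - lam * (1 - r) + mu2,
]


def objective(a, b, c, d):
    """Expected two-period utility for wages (w11, w21, w12, w22)."""
    return r * (u(a) + u(b)) + (1 - r) * (u(c) + u(d))


# A budget-neutral deviation that shifts first-period pay between events.
d = 0.05
deviation = objective(w11 - d * (1 - r), w21, w12 + d * r, w22)
optimum = objective(w11, w21, w12, w22)
```

All four stationarity residuals vanish, the multiplier on the quit constraint after the human-capital task is strictly positive because q exceeds p, and the deviation lowers expected utility.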

Example 3: Preference for “Critical” Jobs

The final example is a simple “efficiency wage” model. According to 
efficiency wage theories, the productivity of an employee is an in¬ 
creasing function of his wage, so employers may find it optimal to pay 
a wage exceeding the market-clearing level. Higher wages may in¬ 
crease productivity for a wide range of reasons; for example, they 
may encourage employees to work more diligently or they may attract 
better applicants or reduce employee turnover. Several of the impor¬ 
tant papers in the efficiency wage literature are reprinted in a volume 
edited by Akerlof and Yellen (1986) together with a helpful survey by 
the editors. 

The purpose here is to note that the same factors that make an 
employer choose to pay wages in excess of market clearing may also 
make it choose to pay different wages for different jobs in a way 
unrelated to employee qualifications so that employees will care about 
how those jobs are assigned. In particular, it is shown below that 
wages are positively related to the costs of job turnover since higher 
wages reduce costly turnover. 

Thus assume that the gross profits earned when $x_i$ occurs are $\pi_i$ if
the employee works and $\pi_i - A_i$ if he quits. The agent is assumed to
be a risk-neutral expected wage maximizer: His utility is $w$ if he works
in job $i$ at wage $w$, $g + b$ if he is laid off and receives a layoff
bonus $b$, and $g + b_q$ if he quits and receives bonus $b_q$. The variable
$g$, the employee’s outside opportunities, is privately observed by the
employee after the job is assigned and is drawn from a distribution $F$
with a density function $f$ that is continuous and positive on the interval
$(0, \bar{g})$. There is no bonding of employees, and the employer cannot
penalize the employee for quitting; that is, $b, b_q \ge 0$.

To ensure an interior optimum for the contracting problem, assume
that $\bar{g} > \max_i A_i > \min_i A_i > 0$. To have the optimum
characterized by first-order conditions, also assume strict quasi concavity
of the objective (4) below for all values of $A$; this amounts to the
assumption that $w + [F(w)/f(w)]$ is increasing in $w$.

If the employer’s only instrument were to set wages $w_i$ to pay in
each event and a termination bonus $b$ to pay to departing employees,
the employee would quit whenever his outside opportunities were at
least $w_i - b$. The problem could then be written in the form

$\max \sum_i p_i\{(\pi_i - w_i)F(w_i - b) + (\pi_i - A_i - b)[1 - F(w_i - b)]\}$ (4)


subject to

$\sum_i p_i \left[ w_i F(w_i - b) + \int_{w_i - b}^{\bar{g}} (g + b) f(g)\, dg \right] \ge \bar{u}$.

The wage policy $w_i$ that maximizes (4) takes the form $w_i = w(A_i)$. Since
$w_i$ does not depend on $\pi_i$, there is no necessary relation between the
interests of the employer and those of the employee. Thus, as in the
previous models, arbitrarily severe conflicts of motives can arise
between the employer and the employee. The heuristic optimal wage
policy satisfies the rearranged first-order condition

(5) 

where $\lambda$, the Lagrange multiplier of the single constraint, is the
marginal cost of providing an extra dollar of expected income to the
employee. It is clear that $\lambda$ cannot exceed one. Hence, the right-hand
side of (5) is increasing in $w_i$, so wages increase with $A_i$: Employees
prefer to occupy “critical” jobs in which turnover is costly to the
employer.
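
A first-order condition of the kind referred to as (5) can be derived directly from (4). The sketch below is a reconstruction under stated assumptions rather than the paper's own display: it takes the employee's constraint to be his probability-weighted expected payoff, assumes an interior optimum, and writes $\lambda$ for the multiplier on that constraint. The uniform special case at the end is an added illustration.

```latex
% Lagrangian for (4), with multiplier \lambda on the employee's constraint
\mathcal{L} = \sum_i p_i \Big\{ (\pi_i - w_i) F(w_i - b)
      + (\pi_i - A_i - b)\,[1 - F(w_i - b)] \Big\}
  + \lambda \Big\{ \sum_i p_i \Big[ w_i F(w_i - b)
      + \int_{w_i - b}^{\bar g} (g + b) f(g)\, dg \Big] - \bar u \Big\}

% Derivative with respect to w_i, divided by p_i:
-F(w_i - b) + (A_i + b - w_i)\, f(w_i - b) + \lambda F(w_i - b) = 0

% Rearranged so that A_i stands alone on the left:
A_i = (w_i - b) + (1 - \lambda)\, \frac{F(w_i - b)}{f(w_i - b)}

% Illustration (assumed): g uniform on (0, \bar g), so F(w)/f(w) = w and
A_i = (2 - \lambda)(w_i - b)
  \quad\Longrightarrow\quad
w_i = b + \frac{A_i}{2 - \lambda}
```

With $0 \le \lambda \le 1$ the right-hand side of the rearranged condition is a weighted combination of $w_i - b$ and $(w_i - b) + F(w_i - b)/f(w_i - b)$, both increasing in $w_i$ under the maintained assumption, so a larger turnover cost $A_i$ is matched by a higher wage.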

In an appendix of the working paper version of this paper (Milgrom
1987), I present a full formal analysis of this problem without
the restriction to simple wage policies used above. In the full model, 
the employee may report his outside opportunities to his employer, 
but the truthfulness of any report cannot be assured. The employer 
can take account of the report in setting wages and termination bo¬ 
nuses and in making layoff decisions; it can also randomize on the 
basis of the report. The upshot is that none of these additional op¬ 
tions is useful to the employer and that the heuristic analysis given 
above yields the right answer: 

Theorem 3. The employer has an optimal policy that requires no
randomization or reporting by the employee. The policy establishes a
termination bonus $b$ and, for each assignment $i$, a wage $w_i$; the
employee quits whenever his outside option pays more than $w_i - b$.
Under this optimal policy, the wage $w_i = w(A_i)$ is an increasing
function of $A_i$.


III. When Does It Pay to Restrict Management 
Discretion? 

We now consider a simple model of influence in which the employee 
allocates his available time T between two activities, a directly produc¬ 
tive activity and an influence activity. If the employee spends time t at 
the directly productive activity, then his output will be “high” with 
probability p(t) and “low” with probability 1 - p(t). The organization 
will earn an extra profit of $\pi$ if output is high. Assume that $p'(t)$ is
continuous and strictly positive and $0 < p(t) < 1$ on $[0, T]$.

If the employee spends time $s$ at influence activities and the central
decision maker has discretion to authorize a change from the status
quo, then the change will be authorized with probability $q(s)$ and the
expected increment to profits from added flexibility in decision making
will be $Iy(s)$. Assume that $q'(s)$ and $y(s)$ are continuous and strictly
positive on $[0, T]$. The positive parameter $I$ measures the “importance”
of the decision in terms of its potential to improve profits.

The employee’s preferences are specified by a utility function that
provides utility of $u(w)$ for a wage $w$ in the status quo and $u(w) + k$ for
a wage $w$ when a change in conditions is approved ($k > 0$). Assume
that $u$ is defined on $[0, \infty)$, that $u' > 0$, and that $u'' < 0$. With
these preferences, the employee has no actual aversion to spending time in
productive activities. Formally, that distinguishes this model from the
moral hazard models studied by Harris and Raviv (1979), Holmstrom
(1979), Grossman and Hart (1983), and Holmstrom and Milgrom
(1987). However, this is a moral hazard model because if management
has discretion to change the status quo, then there is an opportunity
cost to the worker’s time: Time spent in production is unavailable
for influence activities.$^5$ This is represented by the
constraints $s + t \le T$ and $s, t \ge 0$.

The wage paid may depend on the decision (change or no change)
and on the employee’s output performance (high or low). There are
four possible decision-performance outcomes. Individual outcomes
are denoted by $i$ and their corresponding probabilities and wages are
denoted by $p_i(s, t)$ and $w_i$.

When the executive has discretion, a rational, self-interested
employee will seek to

$\underset{s,t}{\text{maximize}} \ \sum_i p_i(s, t)u(w_i) + q(s)k$ (6)

subject to $s + t \le T$ and $s, t \ge 0$. The social objective is given by

$\sum_i p_i u(w_i) + \lambda[Iy(s) + p(t)\pi - \sum_i p_i w_i]$, (7)

where X > 0. This objective is a positively weighted combination of the 
firm’s profits and the employee’s utility, but it excludes the employee’s 
utility increment $k$. Excluding $k$ from the social objective represents
the assumption that this employee’s gain is a loss to some other em¬ 
ployee who is accorded equal weight in the social calculus. Thus k 
denotes the magnitude of the redistributional effect of any decision. 

5 Holmstrom and Ricart i Costa (1986) have emphasized that moral hazard does not
require that employees be averse to hard work. In their model, a manager’s career 
concerns can lead him to make investment decisions different from those his employer 
would like, which leads the employer to adapt its capital budgeting procedure to al¬ 
leviate the incentive problem. 

Fix a time allocation $(s, t)$ and let $V(s, t)$ be the optimal value of the
corresponding “implementation problem”:

maximize $\sum_i p_i[u(w_i) - \lambda w_i] + \lambda p(t)\pi$ (8)

subject to $(s, t)$ solves (6). In standard fashion, $V(s, t)$ is an upper
semicontinuous function on $[0, T] \times [0, T]$.

With this notation, the social problem can be expressed as
$\max_{s,t} V(s, t) + \lambda Iy(s)$. The value of this transformed social
objective is increasing in $I$, so the optimal value is increasing in $I$
as well.

When there is no decision maker with authority to alter the status
quo, the maximal social payoff is

$\bar{V} = \max_w u(w) - \lambda w + \lambda p(T)\pi$.


Lemma. $\bar{V} > \max\{V(s, t) \mid s + t \le T,\ s, t \ge 0\}$.

Proof. First, we claim that $\bar{V} > V(0, T)$. Indeed, $\bar{V}$ is the
maximal value of the relaxed version of (8) with $s = 0$, $t = T$, and the
incentive constraint, that $(s, t)$ maximizes (6), omitted. The unique
optimum of the relaxed problem has $u'(w_i) = \lambda$ for all $i$. But then
wages are the same in every outcome, so the employee’s payoff in (6) is
$u(w) + q(s)k$, which is increasing in $s$; hence $(0, T)$ does not
maximize (6), and the optimal value of the constrained problem is less
than the optimal value of the relaxed problem: $\bar{V} > V(0, T)$.

Next we claim that $\bar{V} > V(s, t)$ for all $(s, t)$ with $t < T$. Indeed,
the optimal value of the relaxed version of (8) with the incentive
constraint omitted is obtained by setting $u'(w_i) = \lambda$ for all $i$,
which yields the optimal value $\bar{V} + \lambda[p(t) - p(T)]\pi$. Since
$\lambda \pi p' > 0$, this is less than $\bar{V}$ for all $t < T$, as claimed.

Finally, since $V$ is upper semicontinuous, there exists a pair $(s^*, t^*)$
such that

$V(s^*, t^*) = \max\{V(s, t) \mid s + t \le T,\ s, t \ge 0\}$.

Whatever that pair is, $V(s^*, t^*) < \bar{V}$. Q.E.D.

The optimal value achieved when management has no discretion to
authorize a change is $\bar{V}$, and $\max_{s,t} V(s, t) + \lambda Iy(s)$ is
the optimal value when management does have discretion. In view of the
lemma and the boundedness of $y(\cdot)$, it is clear that as $I$ approaches
zero it is best to restrict management discretion.
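
The limiting argument can be spelled out in one line. In the sketch below, $\bar V$ denotes the no-discretion value, and the symbols $\delta$ and $\bar y$ are introduced only for this illustration: $\delta$ is the strictly positive gap guaranteed by the lemma and $\bar y$ is an upper bound on $y(\cdot)$ over $[0, T]$.

```latex
\delta \equiv \bar V - \max_{s,t} V(s,t) > 0 ,
\qquad
\bar y \equiv \sup_{s \in [0,T]} y(s) < \infty .

\max_{s,t} \big\{ V(s,t) + \lambda I\, y(s) \big\}
  \;\le\; \max_{s,t} V(s,t) + \lambda I\, \bar y
  \;=\; \bar V - \delta + \lambda I\, \bar y
  \;<\; \bar V
\qquad \text{whenever } I < \frac{\delta}{\lambda\, \bar y} .
```

So for any decision whose importance falls below the threshold $\delta/(\lambda \bar y)$, eliminating discretion yields a strictly higher social payoff than the best incentive scheme under discretion.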

Theorem 4. There exists a pair of parameters $(I, k)$ such that when
these parameters prevail, it is better to eliminate discretion than to
provide wage incentives to limit influence activities. Moreover, if $(I, k)$
is such a pair and if $I' \le I$ and $k' \ge k$, then $(I', k')$ is another such pair.

Proof. The arguments preceding the theorem establish all its assertions
except the assertion that if discretion is optimally permitted for
the parameter pair $(I, k)$ and if $\hat{k} < k$, then discretion is optimally
permitted for the pair $(I, \hat{k})$. For this, it suffices to show that, for
all $(s, t)$ and $\hat{k} < k$, $V(s, t \mid k) \le V(s, t \mid \hat{k})$, where the
notation now notes explicitly the dependence of the optimal value of (8)
on the parameter $k$.

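Suppose that $\{w_i\}$ solves (8) for parameter value $k$ and let
$\bar{u} = \sum_i p_i u(w_i)$. For $\hat{k} < k$, define $\hat{w}_i$ by
$u(\hat{w}_i) = [1 - (\hat{k}/k)]\bar{u} + (\hat{k}/k)u(w_i)$. Let
$U(s, t \mid k, w) = \sum_i p_i(s, t)u(w_i) + q(s)k$ and define
$U(s, t \mid \hat{k}, \hat{w})$ similarly. Then, for all $(s, t)$,

$U(s, t \mid \hat{k}, \hat{w}) = \left(1 - \frac{\hat{k}}{k}\right)\bar{u} + \frac{\hat{k}}{k}\, U(s, t \mid k, w)$ (9)

so that if $(s, t)$ maximizes $U(s, t \mid k, w)$, then it also maximizes
$U(s, t \mid \hat{k}, \hat{w})$.

Since $u^{-1}$ is convex, by Jensen’s inequality

$u^{-1}(\bar{u}) = u^{-1}\left(\sum_i p_i u(w_i)\right) \le \sum_i p_i w_i$. (10)

Applying Jensen’s inequality and substituting from (10), we get

$\sum_i p_i \hat{w}_i = \sum_i p_i u^{-1}\left[\left(1 - \frac{\hat{k}}{k}\right)\bar{u} + \frac{\hat{k}}{k}\, u(w_i)\right] \le \left(1 - \frac{\hat{k}}{k}\right)u^{-1}(\bar{u}) + \frac{\hat{k}}{k}\sum_i p_i w_i \le \sum_i p_i w_i$. (11)

It follows from (9), (11), and the definition of $V(\cdot)$ that
$V(s, t \mid \hat{k}) \ge V(s, t \mid k)$, as required. Q.E.D.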
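
The wage transformation used in this proof is easy to check numerically. The sketch below assumes a square-root utility function, whose inverse is the convex function v squared, and uses hypothetical outcome probabilities, wages, and redistribution parameters; none of these values come from the paper. It builds the transformed wages, then compares expected utility and expected wage cost before and after.

```python
import math

u = math.sqrt                 # one concave utility, chosen for illustration


def u_inv(v):
    """Inverse of the assumed utility function; convex on v >= 0."""
    return v * v


probs = [0.1, 0.2, 0.3, 0.4]       # outcome probabilities at the fixed (s, t)
wages = [1.0, 4.0, 9.0, 16.0]      # wages that implement (s, t) under k
k, k_hat = 2.0, 0.5                # smaller redistribution parameter k_hat < k

ratio = k_hat / k
u_mean = sum(pi * u(w) for pi, w in zip(probs, wages))

# Transformed wages: utility is pulled toward its mean by the factor k_hat/k.
wages_hat = [u_inv((1 - ratio) * u_mean + ratio * u(w)) for w in wages]

exp_util = u_mean
exp_util_hat = sum(pi * u(w) for pi, w in zip(probs, wages_hat))
exp_wage = sum(pi * w for pi, w in zip(probs, wages))
exp_wage_hat = sum(pi * w for pi, w in zip(probs, wages_hat))
```

The transformed schedule delivers the same expected utility at a lower expected wage bill, and its utility spread across outcomes is scaled by k_hat/k, which is why it preserves the employee's choice of time allocation when the stake is k_hat rather than k.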

A number of assumptions have been incorporated into the model 
to keep things simple, and one may well wonder: How far can these 
be relaxed? First, the restriction to two output levels (high and low) is 
plainly dispensable; what is important for the argument is only that 
the moral hazard problem be severe enough that the first-best is unat¬ 
tainable when management discretion is unlimited. 

Second, we have assumed that the q(s) and y(s) functions are given 
exogenously so that the decision criterion to be used by management 
is not a choice variable of the problem. If there are several possible 
decision criteria but these cannot be committed to ex ante (perhaps 
because it is hard even to describe a standard of evidence that will be 
required), then once again the decision criterion is not a choice vari¬ 
able, and theorem 4 holds precisely as stated. 

Third, the model has been set up with $k$ as a purely redistributional
parameter. We would have reached a conclusion similar to theorem 4
if we included $k$ in the social objective in the following way. Let $I$,
formerly a positive parameter, be allowed to take negative values as
well. Define a social importance function $f(s) = kq(s) + Iy(s)$. The
costs of unlimited discretion still depend only on $k$ and the gains only
on $f(\cdot)$. Then a result resembling theorem 4 can be obtained in terms
of the social importance function $f$ and the real parameter $k$.

Finally, several of the assumptions made here have been relaxed by 
Milgrom and Roberts (1987A), who studied influence activities by
workers seeking a desirable job assignment. Their model includes the 
possibilities of competition among workers, promotions as rewards 
for past performance, and decision rules that are chosen in advance 
by management. Despite these differences, their conclusions rein¬ 
force the general finding that central decision makers ought not al¬ 
ways be allowed full discretion to make optimal decisions given the 
facts at hand since that leads to excessive influence activity. 

IV. Related Literature 

Williamson (1985) and Grossman and Hart (1986) have offered an 
alternative explanation of the diseconomies of centralized control that 
emphasizes the hazards that arise from opportunistic behavior by the 
owner-managers of integrated firms. These theories complement the 
one presented here, which emphasizes distortions in the behavior of 
those who inform and advise the executive that accompany increases 
in executive authority. These two theories are among those integrated
into a general transaction costs framework by Milgrom and Roberts
(1987a). 

The theory here can be viewed as an extension of the rent-seeking 
theories developed by Tullock (1967), Krueger (1974), Posner (1975), 
Buchanan (1980), and Bhagwati (1982). These theories hold that gov¬ 
ernment-granted subsidies, tariffs, and monopolies impose welfare 
losses on society because they lead businesses to waste resources in 
attempts to win tariff protection or monopoly rights for themselves. 
The analyses all indicate that government interventions ought some¬ 
times to be limited in order to discourage wasteful rent-seeking ac¬ 
tivity. 

The analysis here extends these theories in two principal respects. 
First, this is explicitly a cost-benefit analysis. Because those most af¬ 
fected by a decision are often among those best informed about the 
alternatives and their consequences, or are at least best motivated to 
discover and analyze the alternatives and their likely consequences, a 
reasonable theory must allow the possibility that the activities of rent 
seekers can lead to better decisions. Second, the scope of the theory is 
expanded beyond the public sector. There are tremendous payoffs in
the private sector from “salesmanship”—both the actual commissions 
earned by salesmen and the gains to having one’s ideas accepted or 
projects adopted or performance evaluated favorably. This analysis 
substitutes a broad focus on the costs of centralized authority for the 
usual narrower focus on the costs of government intervention. 

Within firms, influence activities can be controlled by compensation
policy, by limiting access to the decision-making process, or by lim¬ 
iting management discretion. In public-sector decision making by 
regulatory bodies and legislatures—especially in a society in which 
public access to decision makers is regarded as a matter of right—the 
corresponding instruments to control lobbying and influence are 
weaker. Consequently, influence costs are likely to be higher in the 
public sector than within firms. Thus one argument in favor of small 
government bureaucracies is that they limit the ability of the govern¬ 
ment to process decisions and so represent a way to restrict the gov¬ 
ernment’s discretionary decision powers. 


V. Applications 

The approach here points to possible economic explanations for and 
analyses of phenomena traditionally studied by sociologists as well as 
to new analyses of some traditional economic problems. Here are just 
a few examples. 

1. Resistance to change. —As we have seen, employees in even the 
best-run firms are rarely indifferent about matters that affect their 
working conditions or job content. Employees can be expected to 
resist those changes that threaten to leave them less well off by failing 
to cooperate in the search for better ways to do business or by subvert¬ 
ing changes in the hope of restoring the older order. This rent- 
seeking theory contrasts with noneconomic theories in the way it 
identifies the sources of resistance, the kinds of changes that it pre¬ 
dicts will be most vigorously opposed, and the strategies that it pre¬ 
dicts will be adopted to overcome resistance by successful firms in 
rapidly changing environments. (See Milgrom and Roberts [1987b]
for a more extensive analysis.) 

2. Vertical integration. —When a firm’s key suppliers are not perfect 
competitors (i.e., their prices exceed their marginal costs), they may 
incur excessive selling costs and impose decision costs on the buyer in 
their attempts to earn the rents associated with marginal sales. All 
these costs are influence costs that can be reduced or eliminated by
vertical integration (which restricts the buyer’s discretion about from
whom to purchase). Any gains realized in this way must be balanced 
against the losses from reduced discretion and the costs of newly 
centralized authority over other decisions in the integrated organiza¬ 
tion. 

3. Takeover bids/golden parachutes. —According to Jensen and 
Ruback (1983), empirical evidence indicates that the stockholders of 
the acquirer do not earn conspicuous excess returns. Thus the eco¬ 
nomic motive for takeovers may well be the increased rents earned by 
the management of the acquirer, for example, because their increased
authority in the merged firm makes their jobs more “critical”
in the sense of example 3 of Section II. Mere transfers of the rents 
earned by the former management to the shareholders and new man¬ 
agement do not enhance efficiency, and that part of takeover activity 
by the acquiring firm and defensive activity by the target firm’s man¬ 
agement that is simply redistributive is wasteful. Golden parachutes, 
properly designed, are executive compensation packages that force 
potential acquirers to reimburse former managers for any lost rents 
when there is a transfer of control. These discourage inefficient 
takeovers and reduce both rent seeking by potential acquirers and 
rent-protecting behavior by existing management. The consequent 
efficiency gains ultimately benefit the shareholders. 

4. Litigation policy .—A court trial is a centralized decision process in 
which the disputants often incur huge costs to effect a redistribution 
of wealth. As “bright-line” law fades and parties become less sure of
the likely outcome of litigation, the discretion of juries and judges
correspondingly rises.$^6$ Damage rules play the role that wages played
in this study of influence within firms: Rules limiting damages reduce
influence costs at the expense of other objectives such as paying “just”
compensation or creating efficient incentives for contractual
performance.


VI. Concluding Remarks 

The economic environment described here differs markedly from the
neoclassical, perfectly competitive spot contracting environment in 
which buyers are indifferent at the margin about what they buy, 
sellers are indifferent about incremental sales, and workers are indif¬ 
ferent about employer decisions. Instead, people care about decisions 
and attempt to influence them. When decision makers are honest and 
rational, influence takes the form of suggesting alternatives and sup¬ 
plying information, opinion, and analysis; when they are not, in¬ 
fluence may take more insidious forms. Efficient organization design 
seeks to do what the system of prices and property rights does in the 
neoclassical conception: to channel the self-interested behavior of
individuals away from purely redistributive activities and into
well-coordinated, socially productive ones. The success that a society’s
institutions have in achieving this objective is a major determinant of its
economic welfare.

6 Even those who are firmly bound to pursue a fixed objective or to adhere to a fixed
set of rules have discretion to the extent that they may exercise judgment in
interpreting and applying rules, admitting and evaluating evidence, resolving
ambiguities, etc. For judges and juries, conflicting precedents and novel
circumstances result in increased discretion for decision making, which makes it
possible for interested parties to profit from what I have dubbed “influence activities.”
Takeover Threats and Managerial Myopia


Jeremy C. Stein 

Harvard University 


This paper examines the familiar argument that takeover pressure
can be damaging because it leads managers to sacrifice long-term
interests in order to boost current profits. If stockholders are
imperfectly informed, temporarily low earnings may cause the stock to
become undervalued, increasing the likelihood of a takeover at an
unfavorable price; hence the managerial concern with current bottom
line. The magnitude of the problem depends on a variety of
factors, including the attitudes and beliefs of shareholders, the
extent to which corporate raiders have inside information, and the
degree to which managers are concerned with retaining control of
their firms.


I. Introduction 

The current wave of corporate takeovers has intensified the debate
over their social desirability. On one side of the fence stand the
raiders, along with many economic and legal scholars, arguing that
takeovers serve two important functions: First, they allow acquiring
firms to generate economies of scale or scope, apply superior
knowledge or skills, or otherwise create a value-improving synergy.
Second, the very threat of takeover disciplines entrenched management,
serving them notice that they are liable to be ousted if they do not
act in the best interests of their shareholders (see, e.g., Grossman
and Hart 1980; Easterbrook and Fischel 1981; Scharfstein 1985).

Those less enthusiastic about takeovers have raised a number of
counterpoints. One line of objection to unfettered takeover activity
that has received a good deal of attention is the “managerial myopia”
argument. It contends that takeover pressure, and the accompanying
fear of being bought out at an undervalued price, leads managers to
focus more heavily on short-term profits rather than on long-term
objectives. Kuttner (1986, p. 17) makes the point as follows: “Beyond
the problem of excessive borrowing, one also must consider what the
casino mentality does to the entire corporate culture. In a world
where whole corporations are prey, the manager who plans for the
long term is a sucker .... Takeover fears only intensify the obsession
with the quarterly bottom line. For when reported profits drop, the
stock may become undervalued, making it an easier target.” In a
similar vein, Auletta (1986, p. 288) remarks: “In such a climate [when
takeovers are prevalent] companies often find their attention diverted
to short term, defensive stances . . . peddling assets [and] reducing
long term capital investments in order to stretch fourth quarter
earnings.”

The goal of this paper is to develop a formal model of the
phenomenon of managerial myopia described above. In so doing, it is
hoped that some light will be shed on the following questions: (1) Can
managerial myopia be consistent with rationality on the part of
shareholders? How can anything that is not in the best long-run
interests of the firm be used to increase the stock price? (2) How bad
a problem is managerial myopia? In particular, are its negative effects
ever strong enough to fully offset the positive synergy benefits
associated with takeovers so that it might be socially desirable to ban
takeover activity? (3) Are takeover threats the sole cause of
managerial myopia? Or is the often impatient behavior of some
stockholders (e.g., portfolio managers who may dump a company's stock
as soon as its earnings reports are not quite up to par) also partly
responsible? (4) In what way does managerial self-interest enter the
problem? (5) How are the answers to these questions affected by the
assumptions concerning how well informed corporate raiders are? Do
raiders with “inside information” necessarily make matters worse?

In order to answer question 1 in the affirmative, one would have to
appeal to some sort of informational asymmetry. If stockholders
observe everything that managers do, any policy that management
knows is not in the best long-run interests of the firm would lower the
stock price. If, on the other hand, stockholders cannot observe all the
inner workings of the firm and must rely on some imperfect summary
statistic such as reported earnings, there is room for the type of costly
signaling described by Spence (1973). Managers might, for example,
be able to boost the stock price by selling off productive assets whose
value shareholders are unable to gauge properly. If left unsold, the
assets may have little effect on current earnings and may be
undervalued by shareholders. Consequently, their sale, which has an
immediate impact on the bottom line, may cause an upward revaluation
of the company’s stock.

In the absence of short-term pressures, there is no strong
motivation for managers to devote resources to making sure that their
stock is never undervalued. After all, the productive assets mentioned
in the example above will eventually start to yield earnings so that the
undervaluation will be transient. The signaling behavior described
above becomes important when there is a chance that raiders will
exploit temporary mispricings of the stock and buy the company at a
price that managers consider to be unfairly low. In such cases,
managers who boost their stock prices by inflating earnings may be
attempting to act in the interests of stockholders by preventing them
from being unfairly “ripped off” by raiders. However, as will become
clear shortly, such attempts are often misguided, resulting in ex ante
losses to shareholders, who could possibly be made better off by
binding managers to never “interfering” with the stock price. 1 (Of
course, stock price boosting may also be undertaken by managers who
have no intention of helping shareholders, but who just want to
discourage takeovers so they can keep their jobs.)

Although takeover threats provide an important motivation for
managerial myopia, they need not be the sole determinant. If one
believes that managers attempt to pump up current earnings so as to
avoid takeovers at undervalued prices, one must look not only at the
extent of takeover pressure but also at the factors that may cause
stocks to be undervalued in the first place. Here the behavior and
beliefs of stockholders come into play. Relatively patient stockholders
may not be discouraged by a low earnings report; they may attribute it
to a policy of long-term investment by the firm. If patient
shareholders are the norm, low earnings will not lead to a large
undervaluation of the stock, and managers will not need to be overly
concerned. Impatient shareholders, on the other hand, may become very
distressed by low earnings reports and may try to dump a stock as soon
as such a report is issued. If such impatience is widespread, managers
will be more fearful of undervaluation and the accompanying
possibility of rip-off by a raider. Hence efforts to boost current
earnings will be more intense.

Given the preceding discussion, one would be tempted to conclude 
that raiders who are better informed than the average shareholder 

1 In this context, the problem of managerial myopia can be seen as a symptom of an
imperfect contract between shareholders and managers. Ideally, shareholders might
hope to write a contract that binds managers to never signaling. However, this is likely
to be impossible in practice, as is explained in n. 6.
must cause more problems than raiders who are not. Wei
raiders are more likely to pounce on an undervalued s' 
should tend to make managers more defensive in the ft 
raiders. Although there is some truth to this notion (; 
when manager and shareholder interests do not coincide), 
apply under all circumstances. As will be seen shor 
informed raiders can often lead to more optimistic outc 
lesser-informed ones. 

The paper is organized as follows: Section II present 
model, and Section III examines the equilibria that result 
ers are “uninformed.” Section IV briefly considers the c 
formed raiders." Finally, Section V discusses some of the 
empirical implications that emerge from the analysis. 

II. The Model 

The model has three periods. At time 1, the managers ol 
Oil Company learn how much oil their exploration acti 
uncovered. In the good state, which occurs with probabi 
have x ( barrels, and in the bad state, which occurs with 
1 - p, they have x 2 barrels, with x 2 < xp Shareholders do r 
which state prevails and must use the ex ante probabilities p 
in valuing the firm. 

The managers can either sell oil today or wait until timt 
“long-term" asset, in the following sense: While the marke 
remain constant over time. Acme is in the midst of de 
technology that will allow it to refine the oil more cheaply (\ 
be done before it can be sold). This technology will not be i 
time 3. Thus while the profit from selling oil today is $1 : 
the profit from waiting until date 3 is $( 1 + rj per barrel 
real interest rate is taken to be zero, waiting is the long-i 
maximizing strategy. 2 Managers may attempt to signal the 
ever, by selling today in order to boost current earnings 
sumed that selling oil is the only feasible way to gen< 
earnings; any other methods involve prohibitive costs. 

The model rules out methods of signaling other than th 
rent earnings. This is not meant to imply that such signals 

2 The “long-term” asset formulation used here is very similar to that e 
different context by Diamond and Dybvig (1983). 

* These earnings are retained by the firm until date 3. In this model, the 
for the type of dividend signaling described by Bhattacharya (1979). 

* This assumption is used only to make the exposition more transpan 
not be so strong. Similar results would be obtained if the marginal cost < 
current earnings above were just required to be greater for the bad-st 
financed stock repurchases) are never used, only that they too are
costly, so that signaling through earnings is relevant at the margin. 5
The model also ignores ex ante contractual solutions to the problem.
One interesting possibility, which will be discussed later in the paper,
is that the firm adopt in its charter a “supermajority” provision
or other type of antitakeover amendment that gives management
greater power to block unwanted takeover bids. 6

The motivation for signaling is the presence of a raider, who
investigates the firm at date 2. In the course of his investigation, the
raider will turn up a synergistic improvement v drawn from a
probability distribution with cumulative distribution F(v); that is, the
date 3 profits of the firm will be augmented by an amount v if the
raider takes control of the firm. The raider may also learn how many
barrels of oil the firm has. This will be referred to as the “informed
raider” case. If he does not know which state prevails and must use the
same p and 1 - p probability assessments as the stockholders, this is
known as the “uninformed raider” case. In both cases, it is assumed that
once the raider investigates the firm, he has the option of attempting a
takeover, at a cost c. The cost c can be thought of as representing the
administrative and legal expenses incurred in making a bid. For
expository purposes, c will be used as a parameter that measures the
degree of takeover pressure. Implicit in the discussion is the notion
that such pressure can, to a degree, be controlled by policymakers: by
erecting regulatory obstacles, they can effectively raise takeover costs
in the form of lawsuits, increased costs of financing, and so forth.


5 The editor has suggested another signaling mechanism. Managers of good-state
firms could announce that they were giving up their current wages in exchange for
more stock in the firm, paying a higher effective price than they would be willing to pay
if the firm were bad. However, such a scheme requires that a great deal of information
be verifiable by outsiders. The more stock a bad-state manager initially owns, the higher
effective price he will pay for a few more shares if that payment enables him to boost
the price of all his existing holdings before a takeover. Thus in order for this type of
signaling to work, outsiders need to know a manager's initial holdings. This is further
complicated by the fact that a manager who cares about other shareholders will behave
as if he owned more shares than he actually does. These difficulties do not arise with
signaling mechanisms in which the manager transacts for the firm's account rather than
his own (i.e., boosting earnings, stock repurchases) since, in such cases, the costs and
benefits accrue to the same “base” (all the firm's equity) and a manager's share in the
base is irrelevant.

6 Another contractual possibility is binding managers to never signaling through the
imposition of a fine when any profits are observed at date 1. Although this is not
illogical in the current model, it does seem unrealistic to penalize managers for profits
of any sort. In fact, it is possible to construct a more complex model in which such
unnatural fines are ruled out endogenously. This can be done by assuming that, in
addition to oil, managers sometimes uncover a perishable commodity that must be sold
for a profit at date 1 or totally wasted. If this occurs relatively frequently, it will not be
desirable to penalize managers for date 1 profits. They must be given some discretion
with respect to generating earnings.
For most of what follows, it will be assumed that there is no conflict
of interest between managers and current shareholders: managers
own stock in the company and seek only to maximize their return
from this stock. In this case, the only differences between managers
and other shareholders are that managers have better information
about how the company is doing, and they have discretion as to how
to handle the company’s assets. A few brief remarks will also be made,
however, about the case in which managers enjoy control of the firm
for its own sake, so that their interests are out of line with those of the
other stockholders.

The model makes it clear why managers may choose to engage in
signaling at date 1. If stockholders have no information at this time,
their best guess of the value of the firm is

$$ V_0 = (1 + r)[p x_1 + (1 - p) x_2]. \tag{1} $$

But suppose that managers know that the firm is in the good state.
Then from their point of view, the stock is underpriced, and it is
possible that a raider may be able to rip them off by acquiring all the
stock for less than its true value of $x_1(1 + r)$. Although selling oil at
date 1 is wasteful relative to waiting until date 3, managers of a
good-state firm may be willing to do it if it can cause an upward
revaluation in the firm’s stock, thereby forcing any raider to pay a fair
price for the firm.
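
The size of the underpricing described above can be illustrated numerically. The following is only a minimal sketch: the function names and the parameter values are assumptions chosen for illustration and do not appear in the paper. It evaluates the uninformed valuation in (1) and compares it with the good-state value $x_1(1 + r)$.

```python
def pooled_value(p, r, x1, x2):
    """Equation (1): shareholders' valuation when they have no information."""
    return (1 + r) * (p * x1 + (1 - p) * x2)


def good_state_value(r, x1):
    """Value of a good-state firm that waits until date 3 to sell its oil."""
    return x1 * (1 + r)


# Illustrative (assumed) parameter values, not taken from the paper.
p, r, x1, x2 = 0.5, 0.1, 100.0, 60.0

v0 = pooled_value(p, r, x1, x2)
true_good = good_state_value(r, x1)

# The gap a raider could capture; algebraically (1 + r) * (1 - p) * (x1 - x2).
underpricing = true_good - v0
print(v0, true_good, underpricing)
```

The gap shrinks as p rises toward one, since shareholders then already expect the good state.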

The managers’ decision whether or not to sell some oil at date 1
depends on several factors: how much they have to sell to change
their market valuation, how large a revaluation they can produce, and
how likely a raid is at date 2.

The basic idea running through the entire analysis is this.
Facilitating takeovers (as parameterized by lowering the cost of
takeovers c) has two conflicting effects on social welfare: it increases
the number of synergistic mergers that are consummated, but it also
leads to increased wasteful signaling, in both firms that are eventually
taken over and those that are not. Consequently, welfare does not
improve monotonically with decreases in c; after a point, the negatives
can begin to outweigh the positives. This nonmonotonicity result runs
counter to the spirit of Grossman and Hart (1980), Easterbrook and
Fischel (1981), Scharfstein (1985), and many others and suggests that
allowing complete freedom in the market for corporate control may
not be an optimal policy.

This notion would be much reinforced if one could also make the
stronger claim that the existence of takeovers can be ex ante welfare
reducing, that is, that social welfare is sometimes lower for a finite c
than it is when c is infinite and takeovers are impossible. As it turns
out, such a claim can indeed be supported, although not with
complete generality. There is a noteworthy exception, a case in which one
can prove that any finite c leads to higher welfare than an infinite c. In
this case, absolute abolition of takeovers would be worse than any
other option.

In order to simplify the analysis, the following strong assumption is
made: A raider can always buy the firm at a price that is exactly equal
to what shareholders perceive to be the current worth of the
company, in the absence of any improvement v. In the context of the
dilution concept advanced by Grossman and Hart (1980), this
corresponds to assuming that raiders are able to dilute the minority
shares of an acquired target quite substantially and thereby capture all
the surplus generated by their improvements. Although this assumption
is not very realistic, it does not produce results that differ markedly
from the more general case in which raiders capture only a portion of
the surplus they generate.

III. The Case of Uninformed Raiders 

The first case to be studied is that in which the raiders, like the
shareholders, do not know which state of the world prevails until time
3. Since raiders and stockholders are symmetrically informed, the
stock price will always be a “fair” one to any raider, whether or not
managers engage in signaling. That is, the stock price (assuming
risk-neutral shareholders) will always equal a raider’s expectation of
the current worth of the firm. Consequently, a risk-neutral raider who is
able to capture the entire surplus created by his improvement v will
make a takeover bid if $v \ge c$. So at date 1, a manager assesses the
probability of takeover at date 2 as $1 - F(c)$, which will henceforth be
denoted $G(c)$. It should be noted that the probability of takeover does
not depend on whether or not there is signaling. Again, this is
because signaling does not change the fact that the raider’s best guess
of a stock’s worth is just its current price.

We are now ready to construct the equilibria of the game. The
equilibria that will be focused on here will be those that have the
following properties: (1) they are Bayesian perfect equilibria:
managers are required to be following the optimal action at date 1, given
shareholder beliefs, and these beliefs are fulfilled by managers'
actions along the equilibrium path; and (2) they satisfy the intuitive
criterion of Kreps (1985); that is, they are characterized by
“reasonable” beliefs off the equilibrium path.

With uninformed raiders, the model features what may be termed
a discrete or “lumpy” signaling technology in the following sense: it
will be impossible for a separating equilibrium to exist in which the
good-state firm sells fewer than $x_2$ barrels in the first period. This is
because (1) both types of firms have equal incentives to pass
themselves off as good rather than bad or average; it raises both of
their expected returns by the difference in stock price times the
probability of takeover $G(c)$; and (2) both types of firms have the same
marginal signaling cost below $x_2$. Hence if separation does occur, it
will involve a cost of no less than $r x_2$. 7 The importance of lumpy
separating costs is explained in the following proposition.

Proposition 1. Depending on parameter values, with uninformed 
raiders there can be both pooling and separating equilibria that are 
Bayesian perfect and robust to the intuitive criterion. 

Thus this model differs from Spence’s (1973) for which Kreps
(1985) showed that there was a unique, separating Bayesian perfect
equilibrium satisfying the intuitive criterion. Let us first see when a
separating equilibrium can be supported in which managers always
signal in the good state. In such an equilibrium, shareholders have
separating beliefs so that their Bayesian updating process goes as
follows: (a) if they observe a profit of $x_2$, they are certain that they
are in the good state 8 and that the stock is worth $x_1(1 + r) - r x_2$;
(b) if they observe a zero profit, they are certain that they are in the
bad state (since managers would have signaled had they been in the good
state) and that the stock is worth $x_2(1 + r)$.

What are the conditions under which it will be optimal for the
managers to fulfill the shareholder beliefs by signaling in the good
state? If management signals at date 1, they will receive
$x_1(1 + r) - r x_2$ with certainty: either a raider will take the firm over
at that price or there will be no takeover, in which case the total date 1
and date 3 oil sales will yield that amount. If management does not
signal, the stock price will be $x_2(1 + r)$, and there is a probability
$G(c)$ of a takeover at this low price. On the other hand, if there is no
takeover, which occurs with probability $F(c)$, not signaling will have
turned out to be a fortunate strategy, for total date 3 oil sales will net
$x_1(1 + r)$.

Putting these considerations together, we can see that, given
separating beliefs, it will be optimal for managers to fulfill these
beliefs by signaling in the good state when

$$ G(c)(1 + r)(x_1 - x_2) - r x_2 \ge 0. \tag{2} $$
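
The step from the payoff comparison to inequality (2) can be written out as follows. This is a reader's sketch of the algebra, using only the payoffs stated in the preceding paragraph and the identity $F(c) = 1 - G(c)$; it is not reproduced from the paper.

```latex
% Good-state managers, separating beliefs.
% Signal:        x_1(1+r) - r x_2 with certainty.
% Do not signal: takeover at price x_2(1+r) with probability G(c);
%                no takeover, payoff x_1(1+r), with probability 1 - G(c).
\begin{align*}
x_1(1+r) - r x_2 &\ge G(c)\,x_2(1+r) + \bigl[1 - G(c)\bigr]\,x_1(1+r) \\
\Longleftrightarrow\quad
G(c)\,(1+r)\,x_1 - G(c)\,(1+r)\,x_2 - r x_2 &\ge 0 \\
\Longleftrightarrow\quad
G(c)\,(1+r)\,(x_1 - x_2) - r x_2 &\ge 0 .
\end{align*}
```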

Inequality (2) implicitly defines the set of takeover costs (c) for
which a separating equilibrium can be supported. If we denote by $c_s$
the point at which (2) is met with equality, this set is simply all c for
which $c \le c_s$. In other words, if takeover costs are sufficiently low
that the threat of takeover is high enough, there can be an equilibrium in
which managers engage in the myopic signaling behavior.

7 This is in contrast to the “smooth” example in Spence’s (1973) paper, in which the
marginal cost of signaling is everywhere lower for the good type, so that with the right
parameter values one can construct separating equilibria with arbitrarily small signaling
costs.

8 Actually, there is a trivial “openness” problem being ignored here: a profit of
slightly more than $x_2$ is needed to establish that the firm is good.

Next, check the conditions under which there can be a pooling
equilibrium. At first glance, it would appear that pooling equilibria
should always involve zero oil sales at date 1. Why should managers
waste money when this expenditure does not help to distinguish their
firm? However, it is in fact possible to construct pooling equilibria in
which both types sell some small amount $x < x_2$ and that are not
refined away by the intuitive criterion. 9 As it turns out, consideration
of these somewhat unnatural equilibria does not alter any of the
conclusions to be sketched below. Thus for expositional simplicity, they
are disregarded in what follows, and we will focus only on the more
“reasonable” pooling equilibria in which date 1 oil sales are zero.

In a pooling equilibrium, shareholders have a different Bayesian
updating process. When they observe a zero profit, they do not
conclude that the firm is bad, but rather that there is only a $1 - p$
probability of the firm’s being bad since both good- and bad-state
managers never show a date 1 profit. We also need to specify beliefs
off the equilibrium path, that is, what shareholders would believe
were they to observe $x_2$. In principle, these beliefs can be almost
anything since $x_2$ is never observed, and hence Bayes’s law need not
apply. However, in order to construct pooling equilibria that satisfy
the intuitive criterion, we must be more selective. It is straightforward
to show that the only pooling equilibria that would survive the
refinement process are those that have the following reasonable
out-of-equilibrium belief: “If $x_2$ is observed, the state must be good.” 10

Given pooling beliefs, managers do not have as strong an incentive
to signal in the good state as they did under separating beliefs. Now,
failure to signal depresses the stock price to only
$(1 + r)[p x_1 + (1 - p) x_2]$ rather than to $x_2(1 + r)$. As before, no
signaling implies that there is a $G(c)$ probability of takeover at the low
price and a probability $F(c)$ that there will be no takeover and revenues
of $x_1(1 + r)$. Signaling, on the other hand, would ensure a return of
$x_1(1 + r) - r x_2$. These considerations imply that it will be optimal for
managers not to signal at date 1 if

$$ G(c)(1 + r)(1 - p)(x_1 - x_2) - r x_2 \le 0. \tag{3} $$


9 These equilibria are supported by the following out-of-equilibrium belief: “If I
observe a firm with a profit of less than x, I will take it to be a bad firm with certainty.”

10 Imagine a pooling equilibrium in which a good firm would like to signal if it had to
sell only $x_2$ but pools because out-of-equilibrium beliefs require much more to be sold to
establish goodness. Such an equilibrium can be “broken” by the logic of Kreps since if a
good firm actually did sell $x_2$, it would have to be judged good. A bad firm would never
do so because it would find it prohibitively costly.
If inequality (3) is met, then a pooling equilibrium can be sustained.
This happens for all $c \ge c_p$, where $c_p$ is the threshold point at which
(3) is met with equality. If takeover costs are high enough so that the
probability of takeover is relatively small, there can be an equilibrium
in which managers take the long-run view and always wait until date 3
to sell any oil. It bears repeating that such a pooling equilibrium is, by
its construction, robust to refinement by the intuitive criterion.

It is clear from an inspection of (2) and (3) that $c_p < c_s$. This leads
us to the following conclusions: (i) If c is “low” (i.e., $c < c_p$), then the
unique equilibrium of the game involves myopic signaling by
managers. (ii) If c is “high” (i.e., $c > c_s$), then the unique equilibrium of
the game is one with no signaling. (iii) If c is “intermediate” (i.e.,
$c_p \le c \le c_s$), there can be two pure strategy equilibria. Separating
beliefs on the part of shareholders can lead to separating behavior on
the part of managers, and pooling beliefs can lead to pooling behavior.
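
To make the three cost regions concrete, the sketch below computes the two thresholds at which (2) and (3) hold with equality and classifies a given takeover cost. It rests on an assumption that is not in the paper: the synergy v is taken to be exponentially distributed with mean `mean_v`, so that the takeover probability is G(c) = exp(-c / mean_v). All function names and parameter values are illustrative.

```python
import math


def takeover_prob(c, mean_v):
    """G(c) = 1 - F(c), assuming v ~ Exponential with mean mean_v (an assumption)."""
    return math.exp(-c / mean_v)


def thresholds(p, r, x1, x2, mean_v):
    """Return (c_p, c_s): the costs at which (3) and (2) hold with equality."""
    k = r * x2 / ((1 + r) * (x1 - x2))  # value of G(c_s) implied by (2)
    if not 0 < k < 1 - p:
        raise ValueError("no interior thresholds for these parameter values")
    c_s = -mean_v * math.log(k)
    c_p = -mean_v * math.log(k / (1 - p))  # (3) requires G(c_p) = k / (1 - p)
    return c_p, c_s


def regime(c, c_p, c_s):
    """Classify a takeover cost into the three regions (i)-(iii) of the text."""
    if c < c_p:
        return "separating (myopic signaling) only"
    if c > c_s:
        return "pooling (no signaling) only"
    return "both pure strategy equilibria possible"


# Illustrative (assumed) parameter values.
c_p, c_s = thresholds(p=0.5, r=0.1, x1=100.0, x2=60.0, mean_v=10.0)
for c in (5.0, 15.0, 25.0):
    print(c, regime(c, c_p, c_s))
```

Because the factor $(1 - p)$ in (3) lowers the gain from signaling, the pooling threshold always lies below the separating threshold under this parameterization, in line with $c_p < c_s$.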

Before we proceed any further, it should also be noted that mixed
strategy equilibria exist in the intermediate cost range. In a mixed
strategy equilibrium, the manager sometimes signals in the good state
and sometimes does not. Consequently, a shareholder observing zero
profit attributes a probability $\alpha$ to the possibility that the state is
good, with $\alpha < p$. With beliefs given by such an $\alpha$, managers will be
indifferent between signaling and not signaling (and hence be willing to
pursue a mixed strategy) if

$$ G(c)(1 + r)(1 - \alpha)(x_1 - x_2) - r x_2 = 0. \tag{4} $$

Equation (4) tells us that for each c in the intermediate range, there
exists a unique randomizing scheme (parameterized by $\alpha$) over the
pure strategies (signal, do not signal) that supports a mixed strategy
equilibrium. For the remainder of this paper, however, these mixed
strategy equilibria will be accorded little attention. The primary
emphasis will be on the pure strategy equilibria.
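
Equation (4) can be solved directly for the shareholders' belief after a zero profit, written `alpha` below. The sketch also backs out the probability `q` with which a good-state manager would have to signal for that belief to be consistent with Bayes's rule; this second step is a reader's derivation under the model's assumption that bad-state managers never signal, and is not stated in the paper. Names and numbers are illustrative.

```python
def mixed_alpha(G, r, x1, x2):
    """Belief alpha that makes good-state managers indifferent, from (4)."""
    return 1 - r * x2 / (G * (1 + r) * (x1 - x2))


def signal_prob(alpha, p):
    """Good-state signaling probability q such that
    P(good | zero profit) = p(1 - q) / (p(1 - q) + (1 - p)) equals alpha."""
    return 1 - alpha * (1 - p) / (p * (1 - alpha))


# Illustrative (assumed) values; G is the takeover probability G(c).
p, r, x1, x2, G = 0.5, 0.1, 100.0, 60.0, 0.2

alpha = mixed_alpha(G, r, x1, x2)
q = signal_prob(alpha, p)
print(alpha, q)
```

At the lower end of the intermediate range the belief approaches p and q approaches zero; at the upper end the belief approaches zero and q approaches one.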

Looking at these equilibria, we can come to unambiguous
conclusions when c is in the high or low range. However, when c is in the
intermediate range, it is not clear what the outcome will be. If
shareholders have pooling beliefs, there will be pooling. If they have
separating beliefs, there will be separation. There seems to be no a
priori reason why one equilibrium should be more likely than the
other. 11


11 It should be noted that both types of equilibria are stable in the following sense:
Given separating beliefs on the part of all other shareholders, a lone deviating
shareholder will not do better if he has pooling beliefs and tries to buy shares in
companies that others judge poorly. (If he has to pay even a tiny bit more than the
market price for such shares, he will do strictly worse.) Conversely, a lone deviator to
separating beliefs will not do better if everyone else has pooling beliefs.
An interesting interpretation of the multiple equilibrium situation
is that, for a range of c’s, the detrimental effect of a given amount of
takeover pressure depends on how patient stockholders are. If
stockholders have pooling beliefs, they do not think too badly of a
company that shows temporarily low profits. They realize that it may be a
good company following a long-run strategy that calls for heavy
investment today. Given such beliefs on the part of stockholders, the
company is indeed free to pursue long-run policies because such
policies do not hurt the stock price greatly and hence do not cause an
unacceptable expected loss in the face of takeover pressure.

If, however, stockholders have separating beliefs, they judge a
company very harshly if it does not produce a current profit. They take
it to be a bad company with certainty. With such beliefs, companies are
forced into behaving myopically because failing to produce a profit
causes an undervaluation that is unacceptably large given the level of
takeover pressure.

The idea that managerial myopia may depend on the attitudes of
shareholders is often expressed by members of the business
community and the press. For example, Greenhouse (1986) places some of
the blame for increased shortsightedness by management on the
growing percentage of stock held by “pension fund managers and
other institutional investors [who are] generally more fickle than
individual investors . . . and [who can] dump a stock literally moments
after bad quarterly news is issued” (sec. 3, pp. 1, 8).

The model of this paper can go only so far in rationalizing this 
quote because it does not specify a mechanism by which one of the 
equilibria is selected in the intermediate range. One cannot rigorously 
claim that there are exogenous differences between pension fund 
managers and individual investors that make the preferred pooling 
equilibrium more likely with the latter. One can say, in the context of 
the model, only that they have been lucky in getting stuck in this 
equilibrium. 

Even though the model fails to predict from exogenous
considerations when shareholders will have pooling or separating beliefs
in the intermediate cost range, it is still useful to distinguish the two
possibilities in the following way: We will say that stockholders are
“patient” if, when there is a choice, the pooling equilibrium is observed.
Thus with patient stockholders, separating equilibria occur only when
costs are low, that is, when $c < c_p$. Analogously, stockholders are
“impatient” if, when there is a choice, the separating equilibrium is
observed. With impatient stockholders, separating equilibria are easier
to get: it is necessary only that $c \le c_s$. Hence patient stockholders
lead to higher levels of social welfare when costs are in the intermediate
range.

When raiders are uninformed, we can draw the following general 
conclusions for social welfare. 
Proposition 2. With uninformed raiders, (a) welfare is not
monotonic in c, and (b) takeovers can lead to ex ante welfare losses,
whether shareholders are patient or not.

As we lower the cost of takeovers c, we eventually hit a threshold (at
$c_p$ for patient stockholders, at $c_s$ for impatient stockholders) at which
managers begin to behave myopically. This entails a discrete drop in
ex ante social welfare in the amount $p r x_2$, that is, the probability of
signaling times the resources wasted in signaling.

This discrete drop in welfare can evidently be quite large because it
can lead to ex ante welfare losses. As an example of part b of the
proposition, suppose that $c = 0$ and that $v$ is nonstochastic and
greater than zero. In this case, the probability of a takeover is one,
and signaling will be assured (even with patient stockholders) if
$(1 - p)(1 + r)(x_1 - x_2) - rx_2 > 0$. The expected synergy gains from
takeovers are $v$, and the expected costs of signaling are $prx_2$. Clearly, it
is possible to have a case in which both $v - prx_2 < 0$ and the signaling
condition above is met. In such a case, a world in which takeover costs
are infinite is preferable to one in which they are zero.
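A small numerical illustration may help fix ideas. The sketch below simply evaluates the two inequalities from the example (the signaling condition and the sign of $v - prx_2$) at parameter values that are purely hypothetical and chosen only so that both hold at once; the function names are likewise illustrative and do not come from the paper.

```python
def signaling_condition(p, r, x1, x2):
    """Left-hand side of the signaling condition for c = 0:
    (1 - p)(1 + r)(x1 - x2) - r*x2. Signaling is assured when this is > 0."""
    return (1 - p) * (1 + r) * (x1 - x2) - r * x2


def ex_ante_gain(v, p, r, x2):
    """Expected synergy gain v minus the expected signaling cost p*r*x2."""
    return v - p * r * x2


# Hypothetical parameter values (assumptions, not taken from the paper).
p, r, x1, x2, v = 0.5, 0.10, 2.0, 1.0, 0.02

lhs = signaling_condition(p, r, x1, x2)   # positive: good firms signal
gain = ex_ante_gain(v, p, r, x2)          # negative: takeovers lower welfare

print(f"signaling condition: {lhs:.3f}, ex ante gain: {gain:.3f}")
```

With these values the synergy gain is small relative to the expected resources wasted in signaling, so a world with zero takeover costs does worse than one in which takeovers never occur.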


IV. The Case of Informed Raiders 

Let us now turn to the case in which raiders share the managers’ 
inside information about which state of the world prevails at time 1. 
In this case, the results about lumpy costs of separation and pooling 
equilibria disappear (along with any distinctions between patient and 
impatient shareholders). 

Proposition 3. There can be separating equilibria in which arbitrarily
small signaling costs (less than $rx_2$) are incurred by the good
firm.

Proposition 4. Pooling equilibria robust to the intuitive criterion 
no longer exist unless the cost of takeovers is so high that a good firm 
faces a zero probability of takeover in a pooling equilibrium. 

The propositions (which are formally verified in the Appendix) 
may appear surprising given that the cost of signaling is still the same 
for both types of firms below $x_2$. However, it must be recognized that
the benefit of a high stock price is now greater for good firms than for 
bad firms because, with informed raiders, a good firm that has a high 
stock price has a greater probability of being taken over than a bad 
firm with a high stock price. Informed raiders are relatively unlikely 
to pursue a target that they know to be overpriced. 

With the elimination of the pooling equilibria, the model now always
has a unique separating equilibrium that satisfies the intuitive
criterion. The signaling costs incurred by the good firm in this
equilibrium are straightforward to calculate. In equilibrium, the bad firm
earns a return of $(1 + r)x_2$. If it had a high stock price (corresponding
to being judged a good firm), its expected return would be given by

$$(1 + r)[x_2 + G(z)(x_1 - x_2)], \tag{5}$$

where $z$ is defined as $z = c + (1 + r)(x_1 - x_2)$. Note that the
probability that a bad firm with a high stock price will be taken over, $G(z)$,
reflects the amount that the raider must overpay for the firm. The
amount a bad firm would gain from boosting its stock price to the
value of the good firm is simply $(1 + r)G(z)(x_1 - x_2)$. For the
equilibrium to remain separating, the cost of signaling must exceed this
value so that the bad firm has no desire to masquerade as the good
firm. For the equilibrium to satisfy the intuitive criterion, the cost of
signaling must be the minimum amount that accomplishes this
separation. Fudging the trivial openness problem leads to the claim that
the equilibrium signaling costs are given by

$$rx^* = \min\{rx_2,\ (1 + r)G(z)(x_1 - x_2)\}. \tag{6}$$

This equation leads directly to the following result. 

Proposition 5. When raiders are informed, the takeover mechanism
can never lead to ex ante welfare losses. Even though welfare
may not be monotonic in $c$, it is guaranteed that any finite $c$ is
preferable to an infinite $c$.

The proposition is proved in the Appendix. The intuitive justification
is straightforward. Ex ante welfare reduction requires that the
improvement $v$ be, “on average,” small relative to the signaling costs.
But if $v$ is usually small, signaling costs as given by (6) will be small too
with an informed raider. A bad firm will not have much incentive to
raise its stock price because the probability of its being taken over at
an inflated price is low. Consequently, good firms do not have to
spend a great deal to credibly separate themselves.
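The intuition above can be illustrated numerically. The following is only a sketch under stated assumptions: it takes $G(z)$ to be the probability that $v$ is at least $z$, and it assumes, purely for illustration, that $v$ is exponentially distributed with mean `mean_v`. Neither the distribution nor the parameter values come from the paper. The code evaluates equation (6) for small, moderate, and large typical values of $v$.

```python
import math


def takeover_prob(z, mean_v):
    """G(z) = Prob(v >= z), ASSUMING v is exponential with mean mean_v."""
    return math.exp(-max(z, 0.0) / mean_v)


def signaling_cost(c, r, x1, x2, mean_v):
    """Equilibrium signaling cost r*x^* from equation (6):
    min{ r*x2, (1 + r) * G(z) * (x1 - x2) }, with z = c + (1 + r)(x1 - x2)."""
    z = c + (1 + r) * (x1 - x2)
    return min(r * x2, (1 + r) * takeover_prob(z, mean_v) * (x1 - x2))


# Hypothetical parameters; only the mean of v varies across the three cases.
costs = [signaling_cost(c=0.0, r=0.1, x1=2.0, x2=1.0, mean_v=m)
         for m in (0.1, 0.5, 5.0)]

for m, cost in zip((0.1, 0.5, 5.0), costs):
    print(f"mean of v = {m}: signaling cost = {cost:.6f}")
```

When the improvement $v$ is typically small, the computed signaling cost is close to zero; as $v$ becomes typically large, the cost rises until it is capped at $rx_2$.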

If we compare the results with informed raiders with those with 
uninformed raiders, we see that uninformed raiders are preferable 
over the cost region in which they lead to pooling outcomes since 
informed raiders always entail some signaling cost. Over the cost
region in which both types of raiders lead to separating outcomes,
equation (6) tells us that informed raiders are preferred. 12

12 This preference for informed raiders is increased if we consider a perturbation of
the model that eliminates the pooling outcomes in the uninformed raider case. Suppose
we change the cost structure slightly so that the marginal cost of signaling for the bad
firm is raised infinitesimally to $r + \epsilon$ while that for the good firm is left at $r$. It is easy to
show that with uninformed raiders we are left with a single separating equilibrium that
satisfies the intuitive criterion, and that signaling costs are approximately equal to
$\min\{rx_2, (1 + r)G(c)(x_1 - x_2)\}$. The results for the informed raider case are, on the other
hand, approximately unchanged from eq. (6): both cases now feature signaling costs
that increase smoothly with decreases in $c$, up to a limit of $rx_2$. And over the range of
increase, uninformed raiders always lead to the higher costs.
These relatively optimistic results for informed raiders may appear
somewhat counterintuitive: one might have expected them to cause
more problems than their uninformed counterparts. As it turns out,
the results are strongly dependent on the assumption that manager
and shareholder interests coincide. When this assumption is removed,
the conclusions can be reversed.

Suppose that, in addition to their return from the stock, managers
also derive some further benefit from retaining control of the firm. 13
We might write their utility function as $U = Y + \theta C$, where $Y$ is the
total proceeds from ownership of the firm's stock and $C$ is an indicator
variable that takes on the value one when managers retain control and
the value zero when there is a raid that ousts management. The
parameter $\theta$ is a measure of how strong the desire for control is. 14

In order to maximize expected utility, managers who value control
will wish to take measures to lessen the probability of takeover. With
uninformed raiders, this has no effect since, as we have seen,
managerial signaling does not change the probability of takeover when
raiders are uninformed.

However, when raiders are informed, control-oriented managers
of bad firms will have increased incentive to pass their firm off to
shareholders as being good since, by raising the price above what the
raider knows is the fair value, they can lower the probability of a raid.
Similarly, managers of good-state firms who value control will be less
willing to let their firms be unfavorably judged in the marketplace.
Thus with informed raiders, managerial taste for control forces
signaling costs up. In the polar case of $\theta = \infty$, when managers care only
about control at the expense of the stockholders, good-state managers
will always pay the full signaling cost of $rx_2$ for any level of $c$ that
entails $G(c) > 0$.

This sort of logic implies that if managers value control sufficiently,
the conclusions above are reversed: informed raiders become more
problematic than uninformed ones, and any presumption that
informed raiders can prevent the takeover mechanism from causing ex
ante welfare losses must be abandoned. In this model, the takeover
mechanism can exacerbate managerial moral hazard problems:
instead of making slacking managers work harder, it may lead them to
further waste the firm’s resources in an effort to remain entrenched.

V. Implications of the Model 

This is certainly not the first formal model to suggest that the
reactions of managers to takeover pressure can have undesirable effects,

13 Perhaps it would be costly for them to search for new jobs that offer the same
compensation and status as their current ones.

14 Baron (1983) employed a similar formulation.

even in the absence of managerial moral hazard. Baron (1983) analyzed
a model in which managers can refuse a takeover bid by a raider
when it is made. He found that even if the decision to refuse a bid is in 
stockholder interests at the time it is made, the freedom to make such 
a decision can lead to ex ante welfare losses. In other words,
stockholders might be better off if managers could be bound to never
refusing bids. 15 Shleifer and Vishny (1986) suggested similar conclusions
regarding the paying of “greenmail” by target firms to potential
raiders. Although they modeled greenmail as part of a subgame
perfect strategy of managers who act in the interests of shareholders,
they noted that it is nonetheless possible that outlawing greenmail 
would lead to ex ante improvements in shareholder welfare. 

These arguments lend credence to the beliefs of people such as 
Easterbrook and Fischel (1981) who advocate “managerial passivity” 
rules that would prevent managers from reacting to takeover bids. In 
the view of these writers, the ability of management to interfere in the 
takeover process detracts from the basic virtues that takeovers confer 
onto society. Their prescription is a simple one: ban managerial
activism and give the takeover mechanism as free a rein as possible.

The managerial myopia problem analyzed here is structurally very
similar to the bid refusal and greenmail problems mentioned above.
Even when managers act in shareholders’ interests, the perfect
equilibrium is ex ante inferior to what could be achieved if managers
could be bound to never signaling. However, the policy implications
of managerial myopia are very different. Bid refusal, greenmail, and
other defensive maneuvers that are undertaken at the time of a
takeover bid are usually highly visible. Outlawing such forms of
managerial resistance is a viable option since they are easily detected and
documented.

Managerial myopia, by contrast, is relatively invisible. It may take
place behind the scenes in vast numbers of firms that are never
subject to a takeover bid; it may be very difficult to observe cleanly and
even tougher to document in court. It is a consequence of two
inevitable facts: that managers will tend to be better informed about the
prospects of their firms and that they will have to be given some
discretion in decision making.

If managerial myopia is indeed a problem of serious magnitude 
and it cannot be simply banned, then some control of the takeover 
mechanism may be a second-best alternative. If this is not undertaken 
at the regulatory level, companies may wish to do it themselves by 


15 This is due to adverse selection. If raiders are uninformed, they will be less likely to
bid for targets in which the management can refuse the bid. They fear that the only 
bids that will be accepted will be those for which the target management knows that 
they have overbid. 
enacting antitakeover amendments in their corporate charters, which is
exactly the sort of managerial activism that Easterbrook and Fischel
oppose. While empowering managers to block takeovers might be 
undesirable in a world with no myopia (i.e., when the Baron analysis 
applies), this is no longer true when the forces described in this paper 
are present. 

What evidence is there concerning the managerial myopia
hypothesis? A great deal of the empirical literature on takeovers has
implicitly ignored the very possibility of its existence by (a) only
focusing on the stocks of companies directly involved in takeover bids and
(b) using the maintained assumption that the market price always
reflects the full information value of the firm; that is, that there are no
informational asymmetries among managers, raiders, and
shareholders. A typical approach is to perform an event study and note that the
stock of a target appreciates strongly with a bid, while that of the
acquirer tends to change less significantly. The conclusion then drawn
is that the takeover mechanism must be creating new value and that
target firms wind up capturing most of the surplus generated. 16

Such a conclusion is erroneous on two counts. First, whether
takeover pressure makes managers work harder or makes them
behave more myopically, it must be true that a lot of the action, for
better or for worse, occurs rather invisibly in companies that are
never actually subject to bids. Second, taking market prices to reflect
full information values almost amounts to assuming the synergy
hypothesis. Of course, target prices appreciate with a bid. But this does
not necessarily imply that the target becomes more valuable. It is also
possible that it was underpriced before the bid and has simply come
closer to being priced correctly. The managerial myopia hypothesis
presumes that such deviations in prices from their complete
information values play a role in takeover activity. This presumption is
consistent with the findings of Bradley (1980), who observed that, after
unsuccessful bids, the stocks of target firms tend to retain much of the
price appreciation that they realize during the course of takeover
attempts. By the very act of making a takeover bid, a raider seems to
communicate some positive new information concerning the value of
its target.

Other empirical evidence has sometimes been used to argue
directly against the managerial myopia hypothesis. However, on closer
examination, this evidence often appears ambiguous in its
implications. For instance, a study by the Office of the Chief Economist of the
Securities and Exchange Commission (1985) found that firms with
low R & D expenditures are not taken over less frequently than those


16 See Jensen and Ruback (1983) for references to this literature. 
with higher R & D spending. On the surface, this seems to indicate
that myopic behavior does not prevent raids. But such a conclusion 
ignores a significant sample selection problem: as the model of this 
paper suggests, low R & D should be observed in those firms for 
which the ex ante probability of takeover is the highest. Thus even if 
this myopia is deterring some raids, more raids may still be observed 
among myopic firms than among nonmyopic ones, which face a lower 
ex ante probability of takeover. 

Also cited in the myopia debate are the findings of McConnell and
Muscarella (1985), who observed that stock prices (except in the oil
industry) respond positively to announcements of increased
investment expenditures. Jensen (1986, p. 11), extolling the virtues of the
takeover mechanism, noted that this observation is “inconsistent with
the notion that the equity market is myopic.” While this is correct, it
misses the point. The McConnell-Muscarella observation is consistent
with the notion that managers are myopic: the more reluctant
managers are to invest, the higher will be the present value of those few
projects that they do find sufficiently attractive to undertake and,
hence, the more positive should be the market reaction to the
announcement of a new investment. 17

On the other side of the argument, an interesting empirical point
has been raised by Linn and McConnell (1983). They studied the
reaction of stock prices to the adoption of antitakeover charter
amendments, such as supermajority provisions, which give
management much greater power to block undesired takeovers. As noted
earlier, the model of this paper implies that such amendments should
improve the value of the firm as long as manager and shareholder
interests coincide. Managers need not waste resources to deter
unfairly low bids if they can simply turn them down. And indeed, Linn
and McConnell found that share prices respond positively to the
passage of antitakeover provisions.

Of course, these results are also subject to more than one
interpretation and do not “prove” the existence of managerial myopia. Linn
and McConnell noted that they are also consistent with shareholders 
who believe that antitakeover provisions put management in a better 
position to bargain with raiders on their behalf. 18 A third possibility is 
that there is asymmetric information, and the adoption of the provisions

17 Jensen himself made this distinction between myopic managers and myopic stock
prices. However, his arguments suggest that what he thinks is relevant to the takeover
debate is the latter, not the former.

18 Actually, stockholder perceptions may not always be correct in this case. As Baron
pointed out, the freedom to bargain with raiders, while ex post desirable for good
firms, could lead to ex ante expected losses if raiders are uninformed and can be
adversely selected against. The raiders will be reluctant to attempt takeovers and there
will be less synergistic surplus available.

causes the market to revise upward the probability it attaches to
a possible takeover attempt. While this evidence is thus somewhat 
ambiguous in its support of the managerial myopia hypothesis, it 
seems no less so than that used to argue against it. 19 Furthermore, no 
matter how they are interpreted, the Linn-McConnell findings cast 
doubts on the claims of Easterbrook and Fischel and others whose 
arguments implicitly rely on symmetric information and who
conclude that no interference with the takeover mechanism should be
tolerated. 


Appendix 


Proof of Proposition 3 

This can be demonstrated by an example. Let $c = 0$ and $v$ be nonstochastic
and less than $(1 + r)(x_1 - x_2)$. Consider a situation in which the good firm
separates by selling a very small amount of oil, at a total cost of $\epsilon$.

This situation is an equilibrium: the bad firm would not wish to spend $\epsilon$ to
be judged good by the market because the probability of takeover by an
informed raider of an overpriced firm is zero (since the improvement $v$ is less
than the excess over fair value a raider has to pay to acquire a bad firm at a
good-firm price). On the other hand, the good firm is willing to pay $\epsilon$ to be
judged good since it faces a takeover probability of one.


Proof of Proposition 4 

Suppose that a pooling equilibrium did exist. In such an equilibrium, a raider 
can take either type of firm over for a price P such that 

$$(1 + r)[px_1 + (1 - p)x_2] < P < (1 + r)x_1. \tag{A1}$$

That is, a raider has to pay more than the ex ante expected value of the firm 
because a takeover bid by an informed raider communicates some
information to previously uninformed shareholders, causing them to revise upward
their valuation (see Grossman and Hart 1981). 

The probability that a good firm gets taken over in a pooling equilibrium is
thus $G(z_1)$, where $z_1 = c + P - (1 + r)x_1$. The probability that a bad firm gets
taken over is $G(z_2)$, where $z_2 = c + P - (1 + r)x_2$. If a good firm could
separate itself as good, it would thus gain an expected amount
$G(z_1)[(1 + r)x_1 - P]$ over what it earns in the pooling equilibrium. If a bad
firm could be judged good, it would gain an expected amount
$G(z)(1 + r)(x_1 - x_2) - G(z_2)[P - (1 + r)x_2]$ over what it earns in pooling.
(Recall that $z$ is defined as $z = c + (1 + r)(x_1 - x_2)$.) It is straightforward
to verify that the good firm’s gain always exceeds the bad firm’s. Thus the
intuitive criterion rules out such a pooling equilibrium.


19 A sharper test might be to examine a measure of managerial farsightedness such
as capital or R & D expenditures before and after the adoption of antitakeover
provisions.
Proof of Proposition 5

The ex ante gains due to the takeover mechanism are given by 

$$G(c)E(v - c \mid v \ge c) - prx^*. \tag{A2}$$

It follows directly from the properties of conditional expectation and the
fact that $z > c$ that (A2) is greater than or equal to

$$G(z)E(v - c \mid v \ge z) - prx^*. \tag{A3}$$

Since $E(v - c \mid v \ge z) \ge z - c = (1 + r)(x_1 - x_2)$, it must be that (A3) is greater
than or equal to

$$G(z)(1 + r)(x_1 - x_2) - prx^*. \tag{A4}$$

Given the value of rx* in equation (6) of the text, it is now clear that (A4) is 
positive. Thus the ex ante gains due to the takeover mechanism can never be 
less than zero when raiders are informed. 
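The chain of inequalities in this proof can be checked numerically. The sketch below is illustrative only: it assumes that $G(t)$ is the probability that $v$ is at least $t$ and that $v$ is exponentially distributed with mean `mean_v`, so that $E(v - c \mid v \ge t) = t - c + \text{mean\_v}$ for $t \ge 0$. The distribution, the function name, and the parameter grid are assumptions made for the example, not part of the paper.

```python
import math


def gains_chain(c, r, x1, x2, p, mean_v):
    """Return (A2, A3, A4) under the ASSUMPTION that v is exponential
    with mean mean_v, so G(t) = exp(-t / mean_v) for t >= 0."""
    d = (1 + r) * (x1 - x2)                  # z - c
    z = c + d

    def G(t):
        return math.exp(-max(t, 0.0) / mean_v)

    bad_firm_gain = G(z) * d                 # (1 + r) G(z) (x1 - x2)
    rx_star = min(r * x2, bad_firm_gain)     # equation (6)
    a2 = G(c) * mean_v - p * rx_star         # E(v - c | v >= c) = mean_v
    a3 = G(z) * (d + mean_v) - p * rx_star   # E(v - c | v >= z) = d + mean_v
    a4 = bad_firm_gain - p * rx_star
    return a2, a3, a4


# Hypothetical grid of takeover costs and mean improvements.
results = [gains_chain(c, 0.1, 2.0, 1.0, 0.5, m)
           for c in (0.0, 0.5, 2.0)
           for m in (0.1, 1.0, 10.0)]

for a2, a3, a4 in results:
    print(f"A2 = {a2:.6f}  A3 = {a3:.6f}  A4 = {a4:.6f}")
```

On every point of this grid the computed values satisfy (A2) ≥ (A3) ≥ (A4) ≥ 0, in line with the argument above; this is a spot check under one assumed distribution, not a substitute for the proof.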
prices, which is in turn bound to the cash price (spot or forward) by
the need to carry stocks through time. Of particular interest in this 
regard are conditions under which the amount of futures that short 
hedgers wish to sell will exceed the amount that long hedgers wish to 
buy: that is, hedging is net short, with the expected value of the 
futures price exceeding its current quotation (an expected rise in the 
futures price), or what has become the standard use of that venerable
old term, “backwardation.” 1

The purpose of this paper is to derive theoretically sufficient
conditions for backwardation under a framework with the following
characteristics not generally found in the literature. First, ours is an
explicit treatment of a true futures market as opposed to a forward
market. Anderson and Danthine (1983) (see also Feder, Just, and
Schmitz 1980) provide a detailed treatment of long and short hedging
in forward markets, where “perfect” hedges occur, with the emphasis
being placed on the kinds of uncertainties faced by different traders
and on the limited range of forward markets available to use in
trading away such uncertainties. In contrast, the results of the present
paper hinge crucially on the fact that true commodity futures markets
provide only “imperfect” hedges because a range of delivery
alternatives are available to the seller of a futures contract. Our explanation
for backwardation rests on well-known institutional features of
futures contracts. Second, our approach is in contrast to works that rely
on informational asymmetries (Danthine 1978), differences in
attitudes toward risk, or essentially ad hoc limitations such as hedging
by short hedgers only (Danthine 1978; Baesel and Grant 1982) in
order to explain the existence of backwardation. It also provides an
alternative to the argument that backwardation can result when
futures contracts provide poor consumption hedges (Richard and
Sundaresan 1981).

Backwardation is an important topic since its presence would
ensure long-run speculative profits. The relationship between
backwardation and speculative profits can be neatly summarized with the aid of
figure 1. Let $F_0$ and $F_1$ be the current and later futures prices,
respectively. Speculators (those participants with absolutely no interest in
the physical cash commodity, per se, and hedgers who are involved in
futures beyond the requirements of their cash position) will demand
futures contracts when the futures price is expected to rise ($EF_1 > F_0$).
However, speculators supply contracts when the price is expected to
fall ($EF_1 < F_0$). This is reflected in the speculative demand curve, $D_S$,
and speculative supply curve, $S_S$. There also is a demand for futures
contracts for long hedging purposes, $D_H$, and a supply of contracts for
short hedging purposes, $S_H$.

1 The term “backwardation” has been used in a variety of ways relative to expected
and current spot, forward, and futures prices. Popular use of the old trading
references, backwardation and contango, in the theoretical literature began with Keynes
(1930). In his well-known development of the theory of the risk premium, he
repeatedly made use of the original definitions of backwardation, a situation in which the
current spot price exceeds the current forward price, and contango, the reverse
situation. One frequent use of the term backwardation comes from Keynes’s reference to an
excess of the expected spot price over the current forward price under “normal” supply
conditions. While he named this excess the “risk premium,” it has subsequently come to
be called Keynesian “normal backwardation.” This use of the term backwardation has
been carried into the analysis of futures markets as opposed to the (forward) markets
Keynes treated. As in this paper, the term is currently used to describe a
situation in which the current futures price is a downward-biased estimate of its
value at the maturity of the contract.

A backwardation equilibrium results when the market-clearing
futures price is below the expected later value of the contract. This is
labeled $(N^*, F^*)$ in figure 1. Should such an equilibrium characterize
the futures market, speculators earn long-run profits by simply
buying futures contracts. The requirements for such an equilibrium are
net short hedging ($S_H > D_H$) at $F_0 = EF_1$ and speculative demand of
less than perfect elasticity. This paper examines arguments concerning
the former (the latter requirement receives impeccable treatment
in Cootner [1960]). In particular, the focus will be on sufficient
conditions for net short hedging and backwardation derived from an idea
offered by Houthakker (1957, 1968). 2
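The two requirements for a backwardation equilibrium can be illustrated with a deliberately simple market-clearing calculation. The sketch assumes, for illustration only, that net speculative demand is linear in the expected price change, `spec_slope * (EF1 - F0)`, and that hedging quantities do not respond to price; the functional form, the function name, and the numbers are hypothetical and are not taken from the paper or its figure 1.

```python
def clearing_price(expected_f1, short_hedge, long_hedge, spec_slope):
    """Current futures price F0 that clears the market when net speculative
    demand is spec_slope * (EF1 - F0) (an ASSUMED linear form) and hedgers
    supply short_hedge and demand long_hedge contracts regardless of price.

    Market clearing: spec_slope * (EF1 - F0) + long_hedge = short_hedge.
    """
    net_short = short_hedge - long_hedge
    return expected_f1 - net_short / spec_slope


# Hypothetical numbers: expected later price 100, hedging net short by 20.
f0_backwardation = clearing_price(100.0, short_hedge=60.0, long_hedge=40.0,
                                  spec_slope=10.0)
# Same hedging imbalance, but nearly perfectly elastic speculation.
f0_elastic = clearing_price(100.0, short_hedge=60.0, long_hedge=40.0,
                            spec_slope=1e6)
# Net long hedging reverses the sign of the expected price change.
f0_contango = clearing_price(100.0, short_hedge=40.0, long_hedge=60.0,
                             spec_slope=10.0)

print(f0_backwardation, f0_elastic, f0_contango)
```

Under these assumptions the current futures price sits below its expected later value only when hedging is net short, and the gap shrinks toward zero as speculative demand approaches perfect elasticity, which is why both requirements are needed.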

2 From a historical perspective relevant to the developments by Houthakker, two
views can be isolated regarding the behavior of commodity futures prices. One view is
One of Houthakker’s arguments for net short hedging rests on the
notion that the correlation between cash and futures prices depends 
on the stocks of the commodity. 3 When inventories of a commodity 
are large, cash and futures prices for the commodity tend to be highly 
correlated, while at low inventory levels, cash and futures prices are 
less highly correlated. We refer to price behavior that depends on the 
stock cycle in commodities as the “inventory effect.” 

A high correlation between cash and futures prices tends to
encourage both long and short hedging because it makes the futures
contract a more effective hedging instrument. Large inventories tend 
to occur just after the harvest. Houthakker noted that typically the 
basis (futures price minus cash price) narrows from the harvesttime 
until several months later. This favors short hedgers, who have 
bought cash and sold futures, and works to the disadvantage of long 
hedgers. Hence, during a period from harvest until several months 
later, short hedging tends to dominate long hedging. Again, typically, 
from near the middle of the crop year until near the harvest, the basis 
widens. This benefits long hedgers at the expense of short hedgers, 
but by this time inventories have become somewhat depleted, which 
weakens the correlation between cash and futures prices and acts to 
discourage both long and short hedging. Thus Houthakker arrived at 
an argument for backwardation as a seasonal phenomenon occurring 
during the period when inventories are large and the basis is
narrowing. During the later period of the crop year, when the basis is
widening, there may be a contango (an expected fall in the futures price)
or not depending on the relative strengths of the basis-widening
effect and the low correlation effect.

that there is no trend in futures prices since expectations are brought to bear equally on
all commodity prices, cash (spot and forward) and futures (Hawtrey 1940; Working
1948, 1949, 1953a, 1953b; Telser 1958, 1960, 1967). This is due to the connection over
time provided by carryover stocks from harvest to harvest. Other works tending to
support this view include Gray (1961), Rockwell (1967), and Dusak (1973). Another
view admits the possibility of seasonal trends in the futures price, based on the behavior
of hedgers relative to stock levels over the harvest cycle (Houthakker 1957, 1968;
Brennan 1958; Cootner 1960, 1967). These two basic views spawned the well-known
debate in the literature on futures markets over backwardation and the existence of
long-run speculative profits from a simple strategy of buying futures. The two views
stem from much early work on futures markets. An early point of theoretical
contention was whether or not futures prices should rise throughout the duration of the
contract, especially when there are stocks in excess of the level required to maintain
production at normal levels. In a futures market (see Keynes [1930] and Hicks [1946]
for forward markets), Dow (1940) argued that commodity grades in production at any
given point in time are not perfect substitutes for the grade deliverable on the futures
contract so that long hedgers will not be able to hedge all the risks that confront them.
For an in-depth analytic description of this early literature, see Fort (1985).

3 Houthakker gives two arguments as to why an excess of short hedging will
dominate commodity futures markets. In the one not addressed here, he stresses the
asymmetry of price arbitrage between long and short hedgers. While short hedgers face
limited risk because the futures price cannot exceed the cash price by more than
carrying charges, long hedgers have no such protection. This limited risk situation
encourages short hedging relative to long hedging (Houthakker 1968, pp. 196-97).
This conjecture concerning asymmetric arbitrage is analyzed in Lien and Quirk (1984),
where it is shown that it holds only under rather restrictive conditions.

We strive for an interpretation divorced from the anticipated basis 
movements that play such a large role in Houthakker’s argument. 
The problem that we see with introducing anticipated basis changes 
into a theoretical account of backwardation is that this produces a 
“grin without the cat” theory. Short hedgers buy cash and sell futures 
in the immediate postharvest period because they believe that over 
the very near term less short hedging (relative to long hedging) will 
occur. But there is no explanation offered as to why this should be the 
case. Thus the anticipations aspect of the Houthakker theory explains 
the dominance of short over long hedging today as due to
anticipations that short hedging will not be so dominant tomorrow, but
(outside of further anticipations) why this should be so is outside the
theory. 

The link between net short hedging and this pattern of price
correlation is that large inventories tend to be associated with low cash
prices. 4 Since short hedgers endeavor to avoid the risks associated
with low cash prices and the correlation between cash and futures
prices is large with large inventories, the futures contract offers a
desirable instrument for their purposes. On the other hand, long
hedgers try to avoid the risks of high cash prices, but the low
correlation between cash and futures prices during periods when cash prices
are high limits the effectiveness of the futures contract for long
hedging purposes. The result is that net short hedging characterizes the
futures market. 5

The objective of this paper is to spell out in some detail just how an 
inventory effect can be specified and its implications. Ultimately, the 
argument is based on the flexibility of futures contracts (as contrasted 
with forward contracts) and the lack of perfect substitutability among 


4 Cash prices, of course, are determined by quantities of a commodity demanded and
supplied, not by inventories. But, given intertemporally stationary demand and
assuming no carryover from one harvest to the next, a rational expectations equilibrium
implies that quantities supplied to the market are directly related to inventory levels.
Large releases to the market are required when inventories are large in order to ensure
that price will rise sufficiently over the interharvest period to cover carrying costs.

5 In a similar vein, Cootner (1960) argued that, when inventories are small, short
hedging will be light, and if the output commitments of long hedgers are large, then
hedging can be net long. The time when inventories are likely to be small is just before
the harvest. Hence, Cootner expected a falling futures price until hedged inventories
reach their peak (i.e., stocks in commercial hands are at their peak) and a rising price
only after this peak. He concluded that the requirements for a rising futures price may
not hold over a substantial portion of the duration of some contracts; there can be a
period of net long hedging.
the various commodity grade-location options available for delivery
under the futures contract. Our approach is as interesting as the
question of backwardation itself since conditions under which a
backwardation equilibrium will exist have received little rigorous
treatment in the general way that we propose.

In Section II, we stress the important differences between forward
and futures contracts. Our model of a true futures market is in
Section III. In Section IV, we derive conditions under which an
appropriately specified inventory effect can produce a backwardation
equilibrium, even when short and long hedgers have the same probability
beliefs and the same utility functions over profits. In this derivation,
we find that Houthakker's original specification of an inventory effect
falls short from the viewpoint of expected utility maximization.
Conclusions round out the paper.


II. True Futures versus Forward Markets 

A fundamental difference between commodity futures contracts and
forward contracts is the flexibility provided to sellers (promising
delivery) under the former. 6 For a wheat futures contract, the seller has
the choice of the date during the delivery month to actually make
delivery, the grade of wheat to actually deliver (at set penalties or
premiums for nonstandard grades), and the delivery location itself
(from a set of locations available under the contract). While this
flexibility of delivery terms is essential for avoiding cornering problems
and thinness of markets, it also creates uncertainty as to delivery
terms for the buyer of a futures contract. For this reason, delivery
rarely takes place under commodity futures contracts; cash contracts
specifying delivery conditions in fine detail are typically used when
the actual transfer of a commodity is contemplated.

This flexibility on the seller’s side provides certain arbitrage rela¬ 
tions that characterize the joint probability density function (pdf) be¬ 
tween cash and futures prices at the delivery date. Because the seller 
of the contract has the choice of the grade-location combination to 
deliver, delivery of the lowest-priced alternative will occur should 
delivery become a reality. This ensures the following important arbi¬ 
trage condition. 

Assume that there are two cash grades, one delivery location under 
one futures contract, and both grades deliverable with no penalties or 

6 There are other differences between forward and futures contracts, including the 
lack of well-developed competitive markets for forward contracts, the fact that the 
profits and losses on futures contracts are paid out on a daily basis, and that a clearing¬ 
house acts to protect futures traders from nonperformance, while no such protection 
exists in the case of forward contracts. 
premiums in a two-period model denoted by subscript t = 0, 1. The
prices of the two deliverable cash grades at time 1 are $C_1^1$ and $C_1^2$. The
futures price at time t is denoted $F_t$. The following arbitrage
relationship plays an important role in the derivation of the joint pdf of cash
and futures prices:

$$F_1 = \min(C_1^1, C_1^2). \tag{1}$$

While set in a two-grade, single delivery location context, a more 
general condition is obvious. Relation (1) provides a basic rationale 
for the inventory effect and for Houthakker’s explanation of back¬ 
wardation. One way to state his idea of an inventory effect is that at 
low cash prices, the various grade-location options deliverable under 
a futures contract are closer substitutes for one another than at high 
cash prices because low cash prices occur when inventories of all 
delivery alternatives are large, and at such times, it is the common 
properties of the alternatives rather than their differences that deter¬ 
mine their prices. Because the cash prices are more highly correlated 
with one another at low than at high cash prices, this implies from (1) 
that any cash price is more highly correlated with the futures price at 
low than at high cash prices. 
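
Relation (1) and the resulting basis can be sketched in a few lines of Python. This is only an illustration: the function names and the sample prices are hypothetical, and the extension from two grades to any number of deliverable alternatives is the "more general condition" mentioned above, taken here as an assumption.

```python
def futures_price_at_delivery(cash_prices):
    """Relation (1): the seller delivers the cheapest admissible alternative,
    so the delivery-date futures price is the minimum deliverable cash price."""
    if not cash_prices:
        raise ValueError("need at least one deliverable grade-location option")
    return min(cash_prices)


def basis(cash_price, cash_prices):
    """Gap between one alternative's cash price and the futures price at
    delivery; by relation (1) this gap is never negative."""
    return cash_price - futures_price_at_delivery(cash_prices)


# Hypothetical time 1 cash prices for grade 1 and grade 2.
grades = [3.10, 3.25]
f1 = futures_price_at_delivery(grades)
```

When the two cash prices move close together, the basis of either grade shrinks toward zero, which is the sense in which a cash price tracks the futures price more closely at low cash prices.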

It is instructive to consider the case of a forward market, in which 
perfect hedges are possible and in which the inventory effect is ab¬ 
sent. If the futures market is a forward market defined by the absence 
of flexibility in delivery, then, in effect, the prices of the two cash 
grades are identically equal to each other at time 1. As a result, the 
cash and futures, or more properly spot and forward, markets “come
together” at time 1: the forward market version of arbitrage condition
(1). The difference between the time 1 spot and forward prices is
zero, and there can be no effect on this zero price difference due to
changing inventories. Adhering to a forward market analysis can tell
us nothing about an important aspect of functioning futures markets
since the joint pdf degenerates in this case.
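
The perfect-hedge property of the forward market case can be checked numerically. The sketch below is an assumption-laden illustration rather than part of the model: it uses an exponential (constant absolute risk aversion) utility, a three-point distribution for the time 1 price, and hypothetical parameter values. The revenue term R(y) is omitted because it does not vary with the hedge.

```python
import math


def utility(v, rho=0.5):
    """Assumed CARA utility; any strictly concave u would serve."""
    return -math.exp(-rho * v)


def expected_utility_forward(q, y, f0, c0, k, outcomes):
    """Expected utility of a short hedger when F_1 = C_1 (forward market).

    outcomes is a list of (probability, c1) pairs; R(y) is dropped because
    it is constant in q and does not affect the optimal hedge.
    """
    total = 0.0
    for prob, c1 in outcomes:
        v = (c1 - c0 - k) * y + (f0 - c1) * q
        total += prob * utility(v)
    return total


# Hypothetical time 1 price distribution and parameters.
outcomes = [(0.25, 2.0), (0.5, 3.0), (0.25, 4.0)]
f0 = sum(p * c for p, c in outcomes)      # martingale price: F_0 = E[C_1]
y = 10.0
grid = [i / 10 for i in range(0, 201)]    # candidate hedges 0.0, 0.1, ..., 20.0
best_q = max(grid, key=lambda q: expected_utility_forward(q, y, f0, 2.5, 0.1, outcomes))
```

With the forward price equal to the expected spot price, the grid search selects the full hedge (q equal to y), at which revenue no longer depends on the realized price.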


III. Short and Long Hedging 

Our model of a futures market in a two-period framework is this. 
Two types of hedgers are present in the futures market: elevator 
operators and millers. Elevator operators buy cash wheat today (in 
period 0) and store it for sale in period 1. To the extent that they 
hedge, elevator operators are short hedgers selling futures contracts 
to offset their long positions in the cash market. 

The operation of long hedging by millers has been described in
detail by Working (1953b). Millers make bids on flour contracts with
flour users such as bakeries. For large flour contracts, wheat
requirements are difficult to satisfy through immediate purchases in the
oftentimes thin cash markets. As a consequence, the miller buys
wheat futures at the time of a successful bid for a flour contract. As
wheat is purchased over time to meet milling requirements, cash
purchases are offset by corresponding sales of futures. Gradually, the
initial long hedge is terminated. Millers thus are long hedgers who
buy futures contracts to offset their short positions in the cash market.

We are interested in establishing the result that backwardation can 
emerge in a true futures market given an appropriately specified 
inventory effect, without resorting to assumed differences between 
long and short hedgers concerning attitudes toward risk, informa¬ 
tion, or size of cash commitments. Consequently, we will model the 
futures market as one in which there are, say, N elevator operators 
and N millers. All participants have identical utility functions and the 
same joint pdf's over cash and futures prices. We will also assume that 
they are all located at a common delivery point and deal in grade 1 
wheat, with the grade-location combination an admissible delivery 
alternative under the futures contract. 

Short hedgers (superscript S) are involved in productive
transformation of the cash commodity. 7 A cash commodity revenue function
$R(y^S)$ represents returns strictly related to activities concerning the
commodity that are independent of changes in cash or futures prices,
for example, commodity “grading” by elevator operators. The
function $R(y^S)$ will be assumed strictly concave; that is, $R(0) = 0$, $R' > 0$,
and $R'' < 0$. Elevator operators commit themselves to carry an amount
of the cash commodity $y^S$ between times 0 and 1, earning $C_1^1 - C_0^1 - k$
per unit carried, where k represents known costs of storage. They
attempt to reduce the risk of changes in the value of their holdings by
selling futures, earning $(F_0 - F_1)Q^S$ on their futures position, $Q^S$.
Under the assumption that the commodity is perfectly nonperishable,
the following expression defines the sum of production and futures
trading revenues for short hedgers:

$$V^S = (C_1^1 - C_0^1 - k)\,y^S + R(y^S) + (F_0 - F_1)\,Q^S. \tag{2}$$

7 A more descriptive model of futures trading would include a multiplicity of
contracts and an extended time period. Anderson and Danthine (1981) allowed for a
multiplicity of contracts but under the special case of mean-variance analysis. Lien and
Quirk (1984) applied a rational expectations framework to a T-period, single-contract
model similar to the one developed here. They found that the futures market became a
forward market in all periods prior to T - 1, indicating that one might just as well
examine a two-period model.

Long hedgers (superscript L) have a cash commodity revenue
function, $R(y^L)$, that describes profits from the milling operation itself. It
is assumed that the price quoted to the bakery on the flour bid
(converted to dollars per bushel of wheat) is the cash price of wheat at time
0 plus carrying costs, plus the markup from milling, $R(y^L)$. Payment
for the cash commodity occurs at time 1. For this two-period model,
all cash purchases of wheat by the miller have been telescoped into a
single purchase at time 1, at which time the initial long hedge is
terminated by an offsetting sale of futures contracts. For symmetry,
we will assume that the $R(y^L)$ function is identical to the $R(y^S)$
function. Millers attempt to reduce the risk of changes in the cash price by
buying futures, earning $(F_1 - F_0)Q^L$ on their futures position, $Q^L$.
The following expression defines the sum of production and futures
trading revenues for the long hedger:

$$V^L = (C_0^1 + k - C_1^1)\,y^L + R(y^L) + (F_1 - F_0)\,Q^L. \tag{3}$$
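
The two revenue expressions can be written out as functions to make their symmetry concrete. This is a sketch under stated assumptions: the logarithmic form of R is hypothetical (the text requires only R(0) = 0, R' > 0, and R'' < 0), and all numerical inputs are illustrative.

```python
import math


def cash_revenue(y):
    """Assumed concave revenue function with R(0) = 0, R' > 0, R'' < 0.
    The log form is illustrative only."""
    return math.log1p(y)


def short_hedger_revenue(c1, c0, k, y_s, f0, f1, q_s):
    """Expression (2): carrying return plus R(y) plus the short futures gain."""
    return (c1 - c0 - k) * y_s + cash_revenue(y_s) + (f0 - f1) * q_s


def long_hedger_revenue(c1, c0, k, y_l, f0, f1, q_l):
    """Expression (3): the mirror image for a miller who is long futures."""
    return (c0 + k - c1) * y_l + cash_revenue(y_l) + (f1 - f0) * q_l


# Hypothetical prices and equal commitments for the two sides.
v_s = short_hedger_revenue(c1=3.2, c0=2.5, k=0.1, y_s=10.0, f0=3.0, f1=3.1, q_s=8.0)
v_l = long_hedger_revenue(c1=3.2, c0=2.5, k=0.1, y_l=10.0, f0=3.0, f1=3.1, q_l=8.0)
```

With equal cash commitments and equal hedges, the price-dependent terms of the two sides cancel, so the two revenues always sum to twice R(y) whatever prices are realized.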

In order to incorporate these revenue functions into an expected
utility framework, one must first derive the joint pdf between cash
and futures prices. With the arbitrage relation in (1), the joint pdf
over the futures price and the grade 1 cash price is

$$h(F_1, C_1^1) = \begin{cases} 0 & \text{for } F_1 > C_1^1, \\ \int_{C_1^1}^{\infty} f(C_1^1, C_1^2)\,dC_1^2 & \text{for } F_1 = C_1^1, \\ f(C_1^1, F_1) & \text{for } F_1 < C_1^1, \end{cases} \tag{4}$$

where $f(C_1^1, C_1^2)$ is the joint pdf over the time 1 cash prices. An exactly
symmetrical story can be told for a joint density between the futures
price and the grade 2 cash price. Since all participants are assumed to
deal in grade 1 wheat, henceforth, let it be understood that $C_1$ and $C_0$
stand for the grade 1 cash prices at times 1 and 0, respectively, in
order to ease the notational burden.
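
A discrete analogue of (4) may help fix ideas. The sketch below assumes a small, hypothetical joint probability table over the two cash grades and maps it into a joint table over the futures price and the grade 1 cash price using relation (1); ties between the grades are assigned to the case in which grade 1 is the minimum.

```python
from collections import defaultdict


def joint_futures_cash(f_joint):
    """Map a joint pmf over (grade 1 price, grade 2 price) into a joint pmf
    over (futures price, grade 1 price), with futures price = min of the two."""
    h = defaultdict(float)
    for (c_grade1, c_grade2), prob in f_joint.items():
        h[(min(c_grade1, c_grade2), c_grade1)] += prob
    return dict(h)


# Hypothetical joint pmf over the two time 1 cash prices.
f_joint = {(2, 2): 0.2, (2, 3): 0.3, (3, 2): 0.1, (3, 3): 0.4}
h = joint_futures_cash(f_joint)
```

In the resulting table no mass sits where the futures price exceeds the grade 1 cash price, the cell with equal prices collects all mass for which grade 2 is at least as expensive, and the remaining cells copy the entries in which grade 2 is the cheaper grade.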

Houthakker argued that, because of the existence of multiple
delivery alternatives under the futures contract, the joint density in (4)
should be characterized by a high correlation between $C_1$ and $F_1$ at
low values of the cash price and by a low correlation between these
prices at high values of the cash price. The result is net short hedging
and backwardation ($EF_1 > F_0$). In our treatment, an inventory effect
must be interpreted in a somewhat different manner. On the basis of
our expected utility maximization model, it appears that both cash
and futures prices must be “low” in order for an inventory effect to
generate backwardation. Further, in an expected utility framework,
one finds that partial correlation coefficients aggregate price
movements at a level that is too coarse for the purpose of deriving sufficient
conditions for backwardation.

IV. Normal Backwardation and the Inventory Effect

In the case of a true futures market in which perfect hedges are not
possible, there is a nondegenerate joint pdf over the cash and futures
prices. To identify the role of the inventory effect in generating a
backwardation equilibrium, it is convenient to postulate an initial
martingale equilibrium with $h(F_1, C_1)$ symmetric about $EC_1$ and $EF_1$,
equal cash commitments by long and short hedgers ($y^S = y^L$), and
equal hedging by long and short hedgers ($Q^S = Q^L$). Then we
examine the effect of perturbing the equilibrium.

In the case of a nondegenerate joint pdf, the objective function for
the short hedger becomes

$$EU^S = \int_0^\infty\!\!\int_{F_1}^\infty u(V^S)\,h(F_1, C_1)\,dC_1\,dF_1 + \int_0^\infty u(V^S)\,h^*(C_1)\,dC_1, \tag{5}$$

where $h^*(C_1)$ is the pdf holding when $C_1$ is the minimum cash price,
that is, $C_1 = F_1$. The first term is expected utility occurring when $C_1$ is
not the minimum cash price. For elevator operators, the first-order
conditions with respect to $y^S$ and $Q^S$ (respectively) are

$$EU_y^S = \int_0^\infty\!\!\int_{F_1}^\infty u'(V^S)(C_1 - C_0 - k + R')\,h\,dC_1\,dF_1 + \int_0^\infty u'(V^S)(C_1 - C_0 - k + R')\,h^*\,dC_1 = 0, \tag{6}$$

$$EU_Q^S = \int_0^\infty\!\!\int_{F_1}^\infty u'(V^S)(F_0 - F_1)\,h\,dC_1\,dF_1 + \int_0^\infty u'(V^S)(F_0 - C_1)\,h^*\,dC_1 = 0. \tag{7}$$


These first-order conditions characterize the short hedger at an initial 
martingale equilibrium. 

Integration of the short hedger’s first-order conditions by parts 
yields 
$$EU_Q^S = -\int_0^\infty\!\!\int_{F_1}^\infty H(F_1, C_1)\,[u'' y^S (F_0 - F_1)]\,dC_1\,dF_1 + \int_0^\infty u'(V^S)(F_0 - C_1)\,h^*\,dC_1 = 0, \tag{9}$$

where

$$H(F_1, C_1) = \int_{F_1}^{C_1} h(F_1, x)\,dx, \qquad H^*(C_1) = \int_0^{C_1} h^*(x)\,dx.$$

With an appropriately chosen perturbation of the density h, a
perturbation that incorporates an inventory effect as described presently,
we will show that the volume of short hedging increases, while the
volume of long hedging decreases, at the original martingale
equilibrium. This means that the market-clearing price $F_0$ must fall (see fig.
1). Moreover, we can choose our inventory effect perturbation in such
a way that $EF_1$ is unchanged. Since $F_0 = EF_1$ at the initial martingale
equilibrium, the new equilibrium will exhibit backwardation, that is,
$F_0 < EF_1$.

We proceed to this end as follows. First, we generalize the form of
the joint pdf between cash and futures prices to incorporate a shift
parameter. This allows us to impose an inventory effect directly onto
the joint pdf. Second, we discuss the comparative statics of changes in
cash commitment and hedging with respect to imposition of the
inventory effect. Since we want net short hedging, finding the sign of
the total derivatives of y and Q with respect to our inventory effect
perturbation of the joint pdf h is the primary exercise. At this stage,
we show that an inventory effect perturbation of the joint pdf helps to
bring about increases in short hedging and decreases in long hedging.
The result is that the futures price, $F_0$, must fall as short hedging
increases relative to long. Then, all that remains is to show that $EF_1$
remains unchanged when we perturb the joint pdf. The result of the
inventory effect perturbation will then be a backwardation
equilibrium, $F_0 < EF_1$.

Notationally, perhaps the simplest way to express things is to view h
as a function of a shift parameter $\alpha$ as well as of $F_1$ and $C_1$, that is,

$$h(F_1, C_1, \alpha) = h(F_1, C_1) + \alpha\,\theta(F_1, C_1), \tag{10}$$

where $\alpha$ is simply a shift parameter; for example, $h(F_1, C_1, 0) = h(F_1, C_1)$.
We assume that (a) $\theta(F_1, C_1)$ satisfies

$$\int_0^\infty\!\!\int_{F_1}^\infty \theta(F_1, C_1)\,dC_1\,dF_1 + \int_0^\infty \theta(C_1, C_1)\,dC_1 = 0,$$

(b) $h(F_1, C_1) = 0$ implies $\theta(F_1, C_1) = 0$, and (c) $\alpha$ is restricted so that
$h(F_1, C_1, \alpha) \ge 0$ for all $(F_1, C_1)$.

Note that perturbing the joint pdf $h(F_1, C_1)$ is equivalent to
imposing small changes on the joint pdf $h(F_1, C_1, \alpha)$ through small changes
in $\alpha$ and evaluating the changes at $\alpha = 0$. Further, note that $\partial h/\partial\alpha$,
evaluated at $\alpha = 0$, is $\theta(F_1, C_1)$.
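
The shiftable form of the joint pdf can be illustrated on a discrete grid. The sketch below is a discrete stand-in for (10), with hypothetical probability tables; the three checks mirror conditions (a) through (c), with sums replacing the integrals.

```python
def perturb(h, theta, alpha):
    """Return h + alpha * theta on a discrete grid keyed by (f1, c1),
    enforcing discrete versions of conditions (a), (b), and (c)."""
    if abs(sum(theta.values())) > 1e-12:
        raise ValueError("(a) fails: theta must sum to zero")
    if any(t != 0.0 and h.get(key, 0.0) == 0.0 for key, t in theta.items()):
        raise ValueError("(b) fails: theta must vanish wherever h is zero")
    shifted = {key: h[key] + alpha * theta.get(key, 0.0) for key in h}
    if any(p < 0.0 for p in shifted.values()):
        raise ValueError("(c) fails: alpha is too large, a probability is negative")
    return shifted


# Hypothetical joint pmf over (futures price, grade 1 cash price).
h = {(2.0, 2.0): 0.2, (2.0, 2.5): 0.3, (2.0, 3.5): 0.5}
theta = {(2.0, 2.5): 1.0, (2.0, 3.5): -1.0}
alpha = 0.05
shifted = perturb(h, theta, alpha)
```

Because the perturbation is linear in the shift parameter, the change in each probability divided by the shift recovers theta exactly, which is the discrete counterpart of the derivative evaluated at zero.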

Basically, we want to show how an inventory effect derived from a
perturbation of the joint pdf encourages short hedging relative to
long; that is, $dy^S/d\alpha$ and $dQ^S/d\alpha$ are both positive and $\alpha$ is consistent
with our specific inventory effect. Performing comparative statics on
the first-order conditions in (6) and (7) using the "shiftable" form of
the joint pdf, we have

$$\frac{dQ^S}{d\alpha} = -\frac{(EU_{yy}^S)(EU_{Q\alpha}^S) - (EU_{yQ}^S)(EU_{y\alpha}^S)}{\Delta}, \tag{11}$$

$$\frac{dy^S}{d\alpha} = -\frac{(EU_{QQ}^S)(EU_{y\alpha}^S) - (EU_{yQ}^S)(EU_{Q\alpha}^S)}{\Delta}, \tag{12}$$

where $\Delta = (EU_{yy})(EU_{QQ}) - (EU_{yQ})^2$. Note that $\Delta > 0$ at a regular
maximum. Further, second partials with respect to y and Q are
negative for strictly concave utility. Since we are after conditions under
which the total derivatives (11) and (12) will be positive, the signs of
the cross-partial terms remain to be examined.
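
The comparative statics step is an application of Cramer's rule to the total differential of the two first-order conditions. The sketch below solves that two-equation system for given values of the second partials; the function name and the numerical values are hypothetical and serve only to show how the signs combine.

```python
def comparative_statics(eu_yy, eu_qq, eu_yq, eu_ya, eu_qa):
    """Solve
        eu_yy * dy + eu_yq * dq + eu_ya = 0
        eu_yq * dy + eu_qq * dq + eu_qa = 0
    for (dy/d alpha, dQ/d alpha) by Cramer's rule."""
    delta = eu_yy * eu_qq - eu_yq ** 2
    if delta <= 0:
        raise ValueError("second-order conditions fail: delta must be positive")
    dy = (eu_yq * eu_qa - eu_qq * eu_ya) / delta
    dq = (eu_yq * eu_ya - eu_yy * eu_qa) / delta
    return dy, dq


# Hypothetical values: own second partials negative, all cross-partials positive.
dy, dq = comparative_statics(eu_yy=-2.0, eu_qq=-3.0, eu_yq=1.0, eu_ya=0.5, eu_qa=0.4)
```

With negative own second partials, a positive determinant, and positive cross-partials, every product in both numerators is positive, so both the cash commitment and the hedge rise with the shift parameter.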

The inventory perturbation plays its role in the cross-partials with
respect to $\alpha$. We will need the inventory effect perturbation as
discussed above and an appropriate specification of what is meant by
"small" and “large" values of $C_1$ and $F_1$. First, the cross-partials with
respect to $\alpha$ are

$$EU_{y\alpha}^S = -\int_0^\infty\!\!\int_{F_1}^\infty H_\alpha\,[u'' y^S (C_1 - C_0 - k + R') + u']\,dC_1\,dF_1, \tag{13}$$

$$EU_{Q\alpha}^S = -\int_0^\infty\!\!\int_{F_1}^\infty H_\alpha\,[u'' y^S (F_0 - F_1)]\,dC_1\,dF_1. \tag{14}$$

From (13), note that the term inside the brackets is negative (positive)
when $C_1 \gtrless C_0 + k - R' + (1/\rho y)$, where $\rho = -(u''/u')$ is the coefficient
of absolute risk aversion. Similarly, from (14), the term inside the
brackets is negative (positive) when $F_1 \lessgtr F_0$. These relationships give
us the specification of large and small prices required for the
inventory effect. It is an interesting discovery that there are requirements
on both cash and futures prices. In terms of our model, Houthakker's
original idea that cash prices high or low, alone, might drive
backwardation falls short.
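
The cash price cutoff implied by the bracketed term in (13) can be evaluated directly. The sketch below assumes an exponential utility with constant absolute risk aversion, so that the cutoff does not depend on the level of revenue; the function names and all parameter values are hypothetical.

```python
import math


def bracket_13(c1, c0, k, r_prime, y, rho, v=0.0):
    """u'' * y * (C1 - C0 - k + R') + u' for u(V) = -exp(-rho * V).
    The level of revenue v scales the term but never changes its sign."""
    u1 = rho * math.exp(-rho * v)
    u2 = -rho * rho * math.exp(-rho * v)
    return u2 * y * (c1 - c0 - k + r_prime) + u1


def cash_cutoff(c0, k, r_prime, y, rho):
    """Cash price at which the bracketed term in (13) changes sign."""
    return c0 + k - r_prime + 1.0 / (rho * y)


# Hypothetical parameter values.
params = dict(c0=2.5, k=0.1, r_prime=0.2, y=10.0, rho=0.5)
cutoff = cash_cutoff(**params)
```

Under these values the term is positive for cash prices below the cutoff and negative above it, which is the dividing line between "small" and "large" cash prices used in the text.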

We are now ready to specify an inventory effect perturbation of the
joint pdf so that both of the cross-partials are nonnegative and strictly
positive if the perturbation involves strict inequalities over some
intervals of positive probability measure. This effect will give the result
that the probability that $F_1$ is “close” to $C_1$ is increased for small values
of $C_1$ and $F_1$ while this probability decreases for large values of the two
prices. Let $\partial h/\partial\alpha$ be denoted $h_\alpha(F_1, C_1)$. Note that (a) $h^*_\alpha(C_1) = \theta(C_1, C_1)$,
(b) $H_\alpha(F_1, C_1) = \int_{F_1}^{C_1} \theta(F_1, x)\,dx$, and (c) $H^*_\alpha(C_1) = \int_0^{C_1} \theta(x, x)\,dx$.
Consider the case in which (a) $H_\alpha(F_1, \infty) = H_\alpha(F_1, F_1) = 0$ for every $F_1$
and (b) $H^*_\alpha(C_1) = 0$ for all $C_1$. Using the specification of small and
large found in (13) and (14), choose $H_\alpha(F_1, C_1) \ge 0$ when both $F_1$ and
$C_1$ are small ($F_1 < F_0$ and $C_1 < C_0 + k - R' + [1/\rho y]$) and $H_\alpha(F_1, C_1) \le 0$
when both $F_1$ and $C_1$ are large ($F_1 > F_0$ and $C_1 > C_0 + k - R' + [1/\rho y]$)
with strict inequalities over intervals with positive probability
measure for some terms. Note that this inventory effect follows the
dictates of expected utility maximization since we require that both
prices, $C_1$ and $F_1$, are small and large together in specifying the
perturbation. With this specification of large and small prices and a
perturbation of the joint pdf that induces an inventory effect, we also get
the cross-partials with respect to $\alpha$ in (13) and (14) nonnegative and
strictly positive over some intervals of positive probability measure.
Thus we characterize an inventory effect as follows:

$$\Pr\{C_1 - F_1 < \epsilon \mid (C_1, F_1) \in R\} > \Pr\{C_1 - F_1 < \epsilon \mid (C_1, F_1) \in T\}, \tag{15}$$

where $R = [0, S)$ and $T = (S, \infty]$. The cutoff level for cash and futures
prices is dictated by utility maximization: choose S so that $F_1 = F_0$ and
$C_1 = C_0 + k - R' + (1/\rho y)$.
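
A small simulation can illustrate the comparison in (15). Everything in the sketch is an assumption made for illustration: a single cutoff is applied to both prices, the two grades are priced as a common level plus independent uniform premiums, and the premiums are narrow when the level is low and wide when it is high.

```python
import random


def inventory_effect_frequencies(n=20000, seed=0, cutoff=2.0, eps=0.1):
    """Estimate Pr{C1 - F1 < eps} separately for draws in which both the
    grade 1 cash price and the futures price fall below or above the cutoff."""
    rng = random.Random(seed)
    near = {"low": 0, "high": 0}
    count = {"low": 0, "high": 0}
    for _ in range(n):
        level = rng.uniform(1.0, 3.0)
        # Assumed inventory effect: grades are close substitutes at low levels.
        spread = 0.05 if level < cutoff else 1.0
        c_grade1 = level + rng.uniform(0.0, spread)
        c_grade2 = level + rng.uniform(0.0, spread)
        f1 = min(c_grade1, c_grade2)          # relation (1)
        if c_grade1 < cutoff and f1 < cutoff:
            region = "low"
        elif c_grade1 > cutoff and f1 > cutoff:
            region = "high"
        else:
            continue                          # mixed draws belong to neither region
        count[region] += 1
        if c_grade1 - f1 < eps:
            near[region] += 1
    return near["low"] / count["low"], near["high"] / count["high"]


p_low, p_high = inventory_effect_frequencies()
```

Under these assumptions the cash price lies within the tolerance of the futures price far more often in the low-price region than in the high-price region, which is the ordering that (15) requires.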

The effect of the perturbation can be seen from figure 2. By the
basic arbitrage relation (1), $F_1 \le C_1$. Given a small value of $F_1$, the
dashed curve in figure 2a shows the cumulative density function (cdf)
H plotted against $C_1$. The solid curve shows the portion of the
perturbed cdf, which we call G in the figure, that differs from H
following an inventory effect perturbation. The curve G is greater than H at
low values of $C_1$, indicating that the probability that $C_1$ is close to $F_1$ is
greater following the perturbation. Similarly in figure 2b, the
perturbed cdf G is less than H at high values of $C_1$ so that the probability
that $C_1$ is close to $F_1$ falls following the perturbation.

Our inventory effect perturbation signs some of the terms in (11) 
and (12), but we also need to worry about the remaining cross-partial 
derivative given by 
utility of hedging is an increasing function of the cash commitment of
the trader so that the expression in (16) is positive. 8

The end result, referring back to (11) and (12), is that $dQ^S/d\alpha$ and
$dy^S/d\alpha$ are both greater than zero; short hedging and the size of the
short hedger’s cash commitment increase. This follows since all terms
in (11) and (12) are positive except the second partials with respect to
Q and y, which are negative. Applying the same perturbation to the
long hedger’s first-order conditions results in a decrease in both his
hedged inventory and the size of his cash commitment. Thus the
specified inventory effect perturbation of an initial martingale
equilibrium with equal initial cash commitments by short and long
hedgers induces an excess of short over long hedging at the old
equilibrium price. The result is a decrease in the market-clearing futures
price, $F_0$, given that speculative demand (supply) functions are of less
than infinite elasticity.

All that remains is to show that $EF_1$ remains unchanged, so that a
backwardation equilibrium will result from the inventory effect
perturbation imposed on the joint pdf between cash and futures prices.
That the perturbation has no effect on $EF_1$ can be seen by taking the
following derivative:

$$\frac{\partial EF_1}{\partial\alpha} = \int_0^\infty\!\!\int_{F_1}^\infty F_1\,h_\alpha(F_1, C_1)\,dC_1\,dF_1 + \int_0^\infty F_1\,h^*_\alpha(F_1)\,dF_1 = \int_0^\infty F_1\,[H_\alpha(F_1, \infty) - H_\alpha(F_1, F_1)]\,dF_1 + 0 = 0$$

since $H_\alpha(F_1, \infty) = H_\alpha(F_1, F_1) = 0$ for every $F_1$ and $H^*_\alpha(C_1) = 0$ for all
$C_1$ by hypothesis. Since the original joint pdf was symmetric, the
perturbed pdf exhibits an inventory effect that leaves $EF_1$ unchanged,
and with $F_0$ falling as short hedging increases, a backwardation
equilibrium results. This is summarized in the following proposition.

8 In the case of a forward market, $EU_{yQ} > 0$ as long as $F_0$ is close enough to $C_0 + k - R'$
since if $F_0 - (C_0 + k - R') = \epsilon$, then $EU_{yQ}$ can be written as

$$-\int u''(V^S)(F_0 - C_1)^2\,h^*(C_1)\,dC_1 + \epsilon \int u''(V^S)(F_0 - C_1)\,h^*(C_1)\,dC_1$$

(integration from zero to infinity), which is positive for $\epsilon$ sufficiently small in absolute
value. Things are more restrictive in the case of a true futures market. Let $\phi(F_1)$ be
defined as

$$\int u''(V^S)(C_1 - C_0 - k + R')\,h(F_1, C_1)\,dC_1$$

(integration from $F_1$ to infinity) so that $EU_{yQ}$ now becomes $\int (F_0 - F_1)\,\phi(F_1)\,dF_1$
(integration from zero to infinity) plus the expression above that holds for the forward case.
If u is characterized by constant or decreasing absolute risk aversion and $EC_1 \le C_0 + k - R'$,
then $\phi > 0$. Signing $EU_{yQ}$ requires further conditions on the derivative $\phi'(F_1)$. In
particular, a sufficient condition for $EU_{yQ} > 0$ is $\phi'(F_1) \gtrless 0$ for $F_1 \gtrless 2F_0$, plus the
condition given above for the forward case. One special case in which the cross-partial is
positive occurs when h is uniform and symmetric about $EC_1 = C_0 + k - R'$ and u is a
constant absolute risk aversion utility function, with $F_0 = EC_1$. But, in general, a
positive cross-partial is a restrictive condition.

Proposition. In the case of a true futures market in which (a) all
participants have identical strictly concave utility functions, (b) the
futures market starts at an initial martingale equilibrium ($F_0 = EF_1$),
(c) there are N long and N short hedgers, (d) all participants agree to
the pdf over $F_1$, and (e) cash and hedged commodity commitments are
complementary from an expected utility perspective ($EU_{yQ} > 0$), then
for any common concave utility function there exists an inventory
effect such that the resulting market equilibrium exhibits
backwardation.
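
One step in the argument, that the perturbation can be chosen to leave the expected futures price unchanged, can be verified on a discrete grid. The sketch below uses hypothetical probability tables in which mass is moved only between cash prices that share the same futures price, and the cells with equal cash and futures prices are left untouched; these are the discrete counterparts of the two hypotheses imposed on the perturbation.

```python
def expected_futures_price(h):
    """E[F_1] for a joint pmf keyed by (f1, c1)."""
    return sum(f1 * prob for (f1, _c1), prob in h.items())


# Hypothetical joint pmf over (futures price, grade 1 cash price).
h = {
    (2.0, 2.0): 0.1, (2.0, 2.5): 0.2, (2.0, 3.5): 0.2,
    (4.0, 4.0): 0.1, (4.0, 4.5): 0.2, (4.0, 5.5): 0.2,
}
# At the low futures price, mass moves toward the cash price nearer to F_1;
# at the high futures price, mass moves away from it.
theta = {(2.0, 2.5): 1.0, (2.0, 3.5): -1.0, (4.0, 4.5): -1.0, (4.0, 5.5): 1.0}
alpha = 0.05
perturbed = {key: h[key] + alpha * theta.get(key, 0.0) for key in h}

before = expected_futures_price(h)
after = expected_futures_price(perturbed)
```

Because the perturbation nets to zero at each futures price, the marginal distribution of the futures price is unchanged, and so is its mean, even though the conditional spread between cash and futures prices has shifted in the direction of an inventory effect.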

The work leading up to the summary proposition provides a set of 
sufficient conditions for an inventory effect to generate a backwarda¬ 
tion equilibrium in a world with an equal number of identical short 
and long hedgers. The proof of the proposition utilizes only the qual¬ 
itative properties of the first-order conditions so that less restrictive 
sufficient conditions can certainly be derived. However, they will be
sensitive to the specific properties of the utility function assumed to 
characterize short and long hedgers. What the proof does make clear, 
however, is that an inventory effect depends not only on the level of 
the cash price but on the level of the futures price as well. This serves 
to emphasize that Houthakker’s argument about the behavior of the 
basis relative to only the cash price is generally insufficient for back¬ 
wardation, given the dictates of expected utility maximization. This is 
observed in the design of our inventory effect perturbation. 

The proof also provides a further indication of why we have
interpreted the inventory effect in this paper to mean that the
probability that cash and futures prices are close together is larger when
both prices are low than when both prices are high. In contrast,
defining the inventory effect specifically in terms of the properties of 
partial correlation coefficients at high versus low cash prices does not 
lend itself to simple proofs of backwardation. The reason for this is 
that correlation coefficients aggregate over ranges of cash and futures 
prices, and, as the proof indicates, a much finer kind of specification 
of the inventory effect is needed to prove backwardation for an arbi¬ 
trary concave utility function. 9 

9 The essence of our argument for backwardation, based on the inventory effect, is
that the joint density over cash and futures prices is asymmetric in a special way.
Because of the hypothesized asymmetry, this means that the mean-variance approach
adopted in much of the literature dealing with futures markets (including a
mean-variance approach in Houthakker's original paper) is inapplicable in analyzing the
inventory effect since it is only in the case of an underlying normal (symmetric)
distribution that a convincing argument can be made for the mean-variance approach. This
gives one more indication that the current literature is in fact concerned primarily with
a study of forward markets and not with true futures markets.

In another paper, Fort (1986) provides a framework for investigating
the empirical presence of the inventory effect based on expression
(15). Note that this expression can be cast as a simple comparison of
cdf’s at low and high cash prices. Making such a comparison for
March wheat contracts, 1968-82, Fort finds that such an effect is
quite pervasive in observed futures price distributions. By our
theoretical results and this related empirical analysis, the existence of
backwardation again comes to the fore as an important element in the
controversy over the ability of speculators to earn long-run profits.


V. Conclusions

In this paper, we have developed a set of sufficient conditions for the
existence of backwardation in a true commodity futures market, on
the basis of the notion of the inventory effect, as originally identified
by Houthakker. The appeal of the inventory effect approach to
backwardation is that it is based explicitly on the institutional features of
operating futures markets, including especially the flexibility of
futures contracts and the arbitrage relation linking cash prices to
futures prices. Working with a true futures market model appears to be
essential in developing a theory of price patterns on futures markets.
Flexibility of delivery alternatives in futures contracts is a fact of life
in these markets and introduces complications that can cause major
modifications in theories developed for markets in forward contracts.

We have used a basic arbitrage relation to derive the joint density
over cash and futures prices as of the maturity date of the futures
contract and have examined the effects of introducing asymmetry
into this density in the form of a special kind of inventory effect. What
we have found is that it is useful to reformulate the notion of an
inventory effect in a manner different from that of Houthakker’s
original formulation and that the mere presence of an inventory
effect is not sufficient to establish a backwardation equilibrium: the
inventory effect must be of a specialized type if backwardation is to be
proved. We have explicitly avoided introducing into our model
anticipated changes in the basis. As our proposition shows, it is possible to
prove the existence of backwardation without introducing
anticipations, which we feel is a weak link in Houthakker’s original
formulation.


Bilateral Trading as an Efficient Auction
over Time


Donald R. Deere 

Texas A&M University 


A market composed of pairwise trading under incomplete informa¬ 
tion is modeled in order to analyze how resources are allocated 
among competing uses when information about trade gains is in¬ 
complete. Contrary to the results from studying a single such trade, 
sufficient homogeneity across potential trades guarantees that effi¬ 
ciency obtains. This is analogous to simple first-price auctions with 
homogeneous bidders, where bidders have a common bid function 
and, as a result, the high bidder also places the highest value on the 
auctioned object. With enough symmetry, the decentralized bilateral 
trades in the present model occur as if they were made in a first-price 
auction that occurs through time. The robustness of the efficiency 
result to heterogeneities among agents and to nontrivial search in¬ 
tensity decisions is then considered. 


I. Introduction 

The efficiency of a perfectly competitive market follows from the
implicit assumption that trade takes place between atomistic agents
and the market. But people trade with each other, not with the
market, and these trades generally occur between pairs of agents with
incomplete information. Trade between a pair of agents with
incomplete information has been shown to be inefficient (Myerson and
Satterthwaite 1983). An extrapolation of the inefficiency of pairwise
trading could lead to the conclusion that a “market” composed of
such trades is also inefficient.

I thank Jim Albrecht, Ray Battalio, Severin Borenstein, Steve Bronars, Jeff Miron,
Bill Perry, Robert Reed, Tom Saving, Steven Wiggins, and an anonymous referee for
helpful suggestions. Any remaining errors are my own.

The purpose of this paper is to show that this conclusion of
inefficiency is not necessarily correct, that extrapolation from a single trade
to the market must consider the presence of alternative trading op¬ 
portunities. Individual bilateral trades are inefficient because of the 
tension that each agent perceives between trying to make a trade and 
trying to capture surplus. When all potential traders face the same 
tension, then the private decisions among trading alternatives can 
capture all the available surplus in the market. This paper presents a 
formal analysis of such a market and provides sufficient conditions 
under which efficiency will obtain. The paper then goes on to analyze 
more general settings in which efficiency may or may not occur. 

It is well known that first-price auctions with symmetric, risk- 
neutral buyers generate efficient outcomes, even though reservation 
values are private information (Vickrey 1961). In an auction setting, 
the ever-present tension between buying the object and paying a low 
price is the same for all bidders (formally, they all use the same bid 
function). As such, the bidder placing the highest value on the auc¬ 
tioned object gets it, though at a price below his or her reservation 
price. Thus the incentive to make a good deal need not preclude 
efficiency. 

With enough symmetry, the decentralized bilateral trades in the 
present model occur as if they were made in a first-price auction. The 
distributions of past and expected future wage offers in this model 
over time provide the competition usually generated by other, con¬ 
temporaneous bidders. A common wage-offer function characterizes 
the behavior of each firm as it attempts to hire a worker. Strict monotonicity
of this wage-offer function guarantees that workers are always
employed in their highest-valued use, given the matching technology.

While the market is efficient, it remains true that the outcomes of
some individual trades appear inefficient. The efficiency of a single
bilateral trade is determined by the buyer’s valuation, b, and the
seller’s valuation, s. In a market setting, b and s are not exogenous;
they depend on the trading alternatives available to the buyer and
seller. Hence b and s are shadow prices that reflect the private value of
other trades. Given that these prices are not equal to the social value
of alternative trades, the implications for efficiency at the market level
do not depend on just a direct comparison of b and s, as in the single-trade
case. 1

In Section II, a simple model and example are presented to illustrate
the main ideas. Section III develops the formal model and deduces
its efficiency properties. The robustness of the efficiency result
to various generalizations of the basic model is analyzed in Section IV.
Section V concludes with a brief summary of the paper’s main results.

II. A Simple Model and Example 

This section constructs a model in which turnover of employees is 
efficient but entry of new firms is not. This latter inefficiency results 
from the simple rules regarding how firms enter and exit the market 
that are chosen to make the example particularly tractable. These 
rules are relaxed in Section III. A model of bilateral trades under 
incomplete information in a continuous-time labor market is devel¬ 
oped. The economic problem is to allocate workers across firms with 
differing productivities. The analysis is restricted to steady states. 
There are N risk-neutral firms and T homogeneous, risk-neutral 
workers, with N and T assumed large. Workers retire at the expected 
rate r (r is the parameter of an exogenous Poisson process) and are 
replaced by new, unemployed workers. Each firm, in order to produce,
must hire one worker but cannot hire more. Firms differ with
respect to their production values, Y, with each firm's value defining
the constant flow of output that it can produce through time. The
production values of all firms are drawn from the exogenous uniform
distribution K(·) with support $[\underline{Y}, \overline{Y}]$. Workers can be hired out of
unemployment, where they receive the flow of leisure value $h < \underline{Y}$, or
they can be hired away from another firm. Once hired, a worker is
employed at some firm, but not necessarily the same one, until retire¬ 
ment. A firm that is unsuccessful in hiring a worker, or that becomes 
unattached because its worker retires or quits for a better job, goes 
out of business (at no cost) and is replaced by a new, vacant firm with a 
newly drawn production value. 2

All workers make contact with a randomly chosen vacant firm at the 
expected rate a (a is the parameter of an exogenous Poisson process) 
and solicit a wage offer. 3 The firm cannot observe the reservation
wage of the prospective worker (nor the worker’s current employ¬ 
ment status), and neither can the worker observe the firm’s produc¬ 
tion value. Given no incentive-compatible revelation mechanism, it 

2 The turnover of firms is meant to represent sectoral demand shocks. This is the
reason that production values are specific to a firm rather than to a match. The assumption
that a firm goes out of business if it fails to hire a worker on its first (and thus only)
try or if its worker quits or retires is relaxed in the next section.

3 This is the linear search technology case of Diamond and Maskin (1979). The
probability that an individual worker contacts some vacant firm at all is exogenous and
independent of both the number of vacant firms and the number of workers; hence
aggregate contacts increase linearly with the number of agents. Modeling the contact
rates as choice variables is discussed in Sec. IV.

is assumed that the firm makes a first and final wage offer to the
worker. 4 Acceptance by an unemployed worker generates a new
match, whereas rejection by an unemployed worker results in a new
firm’s (with a newly drawn Y) replacing the offering firm and the
unemployed worker’s continuing to make random contacts. A new
match is formed if the offer is accepted by an employed worker,
and the worker’s former employer goes out of business. If the offer is
rejected, the offering firm disappears. In either case, a new vacant
firm replaces the vanishing one and the process begins again.

The new firm does know the distribution, G(-), of possible reserva¬ 
tion wages for the worker. 5 In choosing its wage offer, the vacant firm 
also expects that if a match is formed there will be a series of compet¬ 
ing wage offers (unobserved by this firm) made over time to its em¬ 
ployee until the worker either quits or retires. These future offers are 
governed by the distribution F(-) and are received at the contact rate 
a, which is unaffected by the individual firm. Given that the firm 
knows a and F(-) but cannot observe the actual arrival of a competing 
offer, a stationary strategy is optimal. Thus the offered wage is for a 
constant payment through time. 

Vacant firms experience the arrival of workers seeking a wage offer 
at the exogenous, expected rate s. 6 The vacant firm’s wage offer, w,
should maximize expected profits conditional on Y, Π(Y). When a
zero interest rate is assumed, Π(Y) is 7


4 This assumption requires that the firm have the ability to commit to such a strategy.
To be fair, most of the literature on bilateral bargaining is concerned with determining
what mechanisms are feasible and optimal in such a setting. In this paper a posted-price
mechanism generates an efficient allocation of labor. In the case of a single bilateral
trade under incomplete information, such a mechanism is inefficient. The point, then,
is that judging efficiency from a single trade may be missing something. For analyses
that help to justify the assumption of first and final offers, see Samuelson (1984),
Hagerty and Rogerson (1985), Horstmann (1985), and Perry (1986).

5 The distribution G(·) (similarly for F(·) below) is much like a prior distribution.
Even though there are a finite number of workers (and firms), G(·) is continuous
because at any point in time each worker can have any reservation wage from the
support of G(·). The distribution of actual reservation wages is, of course, discrete, but it
is the distribution of possible reservation wages that is relevant to a vacant firm.

6 The expected rate s will depend on the parameters r, a, N, and T and is calculated in
the next section. Each worker is assumed to make contact with a random vacant firm
according to an exogenous Poisson process with parameter a. Hence the arrival of a
given worker at a particular vacant firm is a Poisson process with parameter a/(no. of
vacant firms). Therefore, the arrival of any worker at a particular vacant firm is a
Poisson process with parameter s = a(no. of workers)/(no. of vacant firms) because the
sum of independent Poisson variates is also Poisson.

7 If V_V(Y) is the maximized value of being a new, vacant firm and V_F(Y) is the
maximized value of a filled firm with product value Y, then the fundamental recurrence
relation of dynamic programming implies the following steady-state equilibrium value
equations (see Mortensen [1978, 1982a] or Diamond [1981, 1982] for more details):

$$\Pi(Y) = \frac{G(w)(Y - w)}{r + a[1 - F(w)]}, \tag{1}$$


where G(w) is the probability of hiring a worker, 1/{r + a[1 − F(w)]} is
the expected duration of this employment relationship, and (Y - w) is 
the surplus the firm receives conditional on having a worker. The 
firm must trade off the probability of getting (and keeping) this sur¬ 
plus with the size of the surplus. The wage offer divides the surplus 
and provides incentives and is therefore an imperfect signal of pro¬ 
ductivity. With all wages being noisy signals, decisions based on wage 
comparisons have no self-evident efficiency implications. Because it is 
necessary to analyze more than one trade before judging efficiency, 
the determination of wage offers must be examined more closely. 

The distribution, G(·), of possible reservation wages is composed of
the leisure value, h, and all the wages being paid in the various
matches. 8 If it is assumed that a worker leaves her current position
only for an increased payment (i.e., “ties” go to the incumbent), then
no firm offers a wage w < h (hence F(h) = 0), and G(h) is the proportion
of unemployed workers. Noting also that a worker with current
wage w retires with instantaneous probability r and takes a new job
with instantaneous probability a[1 − F(w)] implies that in a steady
state


$$G(w) = \begin{cases} 0, & w < h \\ \dfrac{r}{r + a[1 - F(w)]}, & h \le w \le \bar{w} \\ 1, & w > \bar{w}, \end{cases} \tag{2}$$


where $\bar{w}$ is the maximal wage offer; $F(\bar{w}) = 1$. 9 Note that r/(r + a) is
the expected proportion of workers who are unemployed because any 
wage offer is accepted by unemployed workers (in equilibrium). 
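The flow-balance argument behind equation (2) can be checked numerically. The sketch below is only illustrative: the parameter values and the uniform shape assumed for F(·) are hypothetical choices, not taken from the paper. It evaluates G(w) and confirms that the inflow of new entrants equals the outflow of retirees plus quits, and that G(h) reduces to r/(r + a).

```python
def reservation_wage_cdf(w, F, r, a, h, w_bar):
    """Steady-state G(w) as in equation (2)."""
    if w < h:
        return 0.0
    if w > w_bar:
        return 1.0
    return r / (r + a * (1.0 - F(w)))


# Hypothetical numbers, chosen only for illustration.
r, a = 0.1, 1.0          # retirement rate and contact rate
h, w_bar = 1.0, 3.0      # leisure value and maximal wage offer


def F(w):
    """Assumed (not derived) uniform offer distribution on [h, w_bar]."""
    return min(max((w - h) / (w_bar - h), 0.0), 1.0)


w = 2.0
G_w = reservation_wage_cdf(w, F, r, a, h, w_bar)

# Flows per worker: entrants arrive at rate r; workers leave the set
# "reservation wage <= w" by retiring or by accepting an offer above w.
inflow = r
outflow = r * G_w + a * G_w * (1.0 - F(w))

unemployment_share = reservation_wage_cdf(h, F, r, a, h, w_bar)
print(G_w, inflow, outflow, unemployment_share)
```

Any other cdf could be passed as `F`; the balance holds by construction of (2), which is the point of the check.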


$$iV_V(Y) = s\{G(w(Y))[V_F(Y) - V_V(Y)] - [1 - G(w(Y))]V_V(Y)\},$$
$$iV_F(Y) = Y - w(Y) - \{a[1 - F(w(Y))] + r\}V_F(Y),$$
where i is the interest rate. As i → 0 this gives

$$V_V(Y) = \frac{G(w(Y))[Y - w(Y)]}{r + a[1 - F(w(Y))]} = \Pi(Y).$$

8 The worker’s reservation wage is equal to the flow payment of her current position
because there are no mobility costs, the payment in any one position is constant through 
time, and employed workers can continue to contact vacant firms at the rate a. Equa¬ 
tions for the worker analogous to the first two equations in n. 7 derive this explicitly. 

9 The term G(w) is derived as follows. The flow rate of workers into G(w) is just r, the
flow of new entrants. The flow rate of workers out of G(w) is composed of retirees,
rG(w), plus net quits, aG(w)[1 − F(w)]. In a steady state these flows must balance.

A vacant firm’s first-order condition for the optimal wage offer, w*,
given Y, can be written as (from [1])

$$(Y - w^*)G'(w^*) + \Pi^*(Y)\,aF'(w^*) = G(w^*). \tag{3}$$

The left-hand side is composed of the gain to the firm via a higher 
probability of a successful hire and a lower probability of a quit 
achieved from an increase in the wage offer. This is equated to the 
increased cost to the firm from a higher inframarginal wage payment. 
The firm balances paying a higher wage to attract and keep a worker 
against the reduction in the firm’s surplus that this entails. By substitution
for G(w), (3) can be rearranged to give

$$Y - w^* = \frac{r + a[1 - F(w^*)]}{2aF'(w^*)}. \tag{3'}$$


The distribution, F(-), of potential competing offers is merely the 
steady-state distribution of possible wage offers made by new, vacant 
firms. Thus equilibrium F(-) is defined by 


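$$F(w) = K\!\left(w + \frac{r + a[1 - F(w)]}{2aF'(w)}\right) = \frac{2awF'(w) + r + a[1 - F(w)]}{2aF'(w)(\overline{Y} - \underline{Y})} - \frac{\underline{Y}}{\overline{Y} - \underline{Y}}. \tag{4}$$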


The product values of new replacement firms are always drawn from 
the uniform distribution K(-), and these firms always face the same 
distributions of possible reservation wages and potential competing 
offers in a steady state; hence the symmetry. In a steady state each 
new firm, given its Y value, is always solving the same problem when 
making a wage offer; hence, in equilibrium, there is a common wage- 
offer function. For an equilibrium to exist it is necessary that, given 
the uniform distribution of the Y values and the parameters r, a, h, N, 
and T, a function F(·) exist that satisfies (4) and qualifies as a cumulative
distribution function (cdf) on some interval [h, $\bar{w}$] with strictly
positive density on the interior of this interval. The density, denoted 
F'(-), must be positive on this open interval for (3’) to be the correct 
first-order condition. 

Efficiency of the labor allocation requires that an employed worker 
accept a new offer if and only if output in the new match is greater 
than current output. The optimal wage-offer function, w*(Y) defined
by (3'), is common to all new firms in a steady state. Thus a necessary
and sufficient condition for efficient turnover is that w*(Y) be strictly
increasing. A worker currently earning w_1 in a job producing Y_1 has
an incentive to accept another job producing Y_2 if w_2 > w_1. With Y_2
> w_2, then Y_2 > w_1 > w_2 is possible and, with this transaction viewed
alone, it appears that the worker's refusal is inefficient.

This approach ignores the source of the worker’s alternative valuation,
w_1 = w*(Y_1), and thus incorrectly judges efficiency on the comparison
of Y_2 and w_1 rather than on the comparison of Y_2 and Y_1.
Strict monotonicity of w*(Y) will guarantee that private incentives
coincide with efficiency conditions (i.e., w_2 > w_1 iff Y_2 > Y_1). The offer
w* is a true maximizing choice and dw*/dY > 0 provided that the
second-order sufficient condition from (1) is satisfied. This condition
is equivalent to 0'(w*) > 0 or {r + a[1 − F(w*)]}F''(w*) < a[F'(w*)]^2.

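In the special case in which $h = \underline{Y} - [(r + a)/a](\overline{Y} - \underline{Y}) > 0$, (4) can
be solved using F(h) = 0 to give a uniform distribution for F(·): 10

$$F(w) = \frac{a(w - \underline{Y}) + (r + a)(\overline{Y} - \underline{Y})}{2a(\overline{Y} - \underline{Y})} = \frac{w - h}{2(\overline{Y} - \underline{Y})}. \tag{5}$$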

This implies that the optimal wage-offer function is, from (3'),

$$w^*(Y) = 2Y - \underline{Y} - \frac{r + a}{a}(\overline{Y} - \underline{Y}), \tag{6}$$

so that the wage offer increases $2 for each $1 increase in the firm’s
production value. The strict monotonicity of the common wage-offer 
function guarantees efficient labor turnover. 
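The special case can be verified by brute force. The following sketch uses hypothetical parameter values (r, a, and the support of the uniform production-value distribution are illustrative assumptions), builds the uniform offer distribution and the steady-state G(·) of Section II, and compares the closed-form wage offer in (6) with the profit-maximizing wage found on a grid using the profit expression in (1).

```python
# Hypothetical parameters, for illustration only.
r, a = 0.1, 1.0
Y_lo, Y_hi = 10.0, 11.0                      # support of the uniform K(.)
h = Y_lo - (r + a) / a * (Y_hi - Y_lo)       # special-case leisure value
w_bar = h + 2.0 * (Y_hi - Y_lo)              # top of the offer distribution


def F(w):
    """Uniform offer distribution on [h, w_bar] (the special case)."""
    return min(max((w - h) / (2.0 * (Y_hi - Y_lo)), 0.0), 1.0)


def G(w):
    """Steady-state distribution of reservation wages, equation (2)."""
    return 0.0 if w < h else r / (r + a * (1.0 - F(w)))


def profit(Y, w):
    """Expected profit of a vacant firm, equation (1)."""
    return G(w) * (Y - w) / (r + a * (1.0 - F(w)))


def wage_offer(Y):
    """Closed-form optimal offer, equation (6)."""
    return 2.0 * Y - Y_lo - (r + a) / a * (Y_hi - Y_lo)


def best_wage_on_grid(Y, n=2000):
    """Grid search over [h, w_bar] for the profit-maximizing wage."""
    grid = [h + (w_bar - h) * i / n for i in range(n + 1)]
    return max(grid, key=lambda w: profit(Y, w))


Y_mid = 10.5
w_closed_form = wage_offer(Y_mid)
w_grid = best_wage_on_grid(Y_mid)
print(w_closed_form, w_grid)
```

The grid search is deliberately crude; it is meant only to confirm that the closed form is a maximizer and that offers rise two-for-one with Y, not to serve as a solution method.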

Even though (1) information is poor, (2) all trades are bilateral and 
are mediated only by the wage offer, and (3) each firm privately
chooses its wage offer to trade off incentives and surplus, the resulting
interfirm labor turnover is efficient. How so? Essentially, firms are
involved in a first-price auction that occurs through time. Instead of
bidding directly against N − 1 other bidders, the firm faces the collection
of potential past (accepted) bids summarized in G(·), as well as the
possibility of expected future bids summarized by F(·). It is well
known (Vickrey 1961) that, with ex ante homogeneous, risk-neutral
bidders behaving noncooperatively, the highest bid will be made by
the bidder placing the highest value on the auctioned object (because
each uses the same bid function); hence efficiency. In this model,
given the common distribution K(Y), each new firm will use the same
(strictly increasing) wage-offer function; hence w_1(Y_1) > w_2(Y_2) iff Y_1
> Y_2 for all Y_1, Y_2, thus generating efficient labor turnover.
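The auction benchmark invoked here is easy to simulate. The sketch below assumes values drawn independently from a uniform distribution on [0, 1], for which the standard symmetric equilibrium bid is (N − 1)/N times the bidder's value; the number of bidders, number of auctions, and seed are arbitrary illustrative choices. Because every bidder uses the same strictly increasing bid function, the highest bid always comes from the highest valuation.

```python
import random


def equilibrium_bid(value, n_bidders):
    """Symmetric first-price bid for i.i.d. uniform[0, 1] values."""
    return (n_bidders - 1) / n_bidders * value


def share_of_efficient_auctions(n_auctions=1000, n_bidders=5, seed=0):
    """Fraction of simulated auctions won by the highest-value bidder."""
    rng = random.Random(seed)
    efficient = 0
    for _ in range(n_auctions):
        values = [rng.random() for _ in range(n_bidders)]
        bids = [equilibrium_bid(v, n_bidders) for v in values]
        winner = max(range(n_bidders), key=bids.__getitem__)
        if values[winner] == max(values):
            efficient += 1
    return efficient / n_auctions


efficient_share = share_of_efficient_auctions()
print(efficient_share)
```

The winner pays less than his or her valuation in every auction, yet the allocation is always efficient, which is the property the wage-offer function inherits in the labor market model.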

This example highlights the idea that an auctionlike mechanism 
can govern trades in a bilateral setting and ensure that all available 

10 Regardless of how “ties” are resolved, there would be a discrete change in the
probability either of a hire or of a later quit (or both) from offering a wage infinitesimally
above h as compared with offering w = h. Thus F(h) = 0. If the lowest offered
wage, $\underline{w}$, were strictly greater than h, it would pay to reduce $\underline{w}$. Thus F(h + ε) > 0.
Hence this is the correct initial condition.

trade gains are captured. The next section develops a more general
model that relaxes the assumption that a firm has only one “chance” 
at a worker. This artifice made the problem more tractable, in partic¬ 
ular, by causing the steady-state distribution of potential output 
values across vacancies to be identical to K(-) (because every vacant 
firm was a new firm). This simplification provides much insight but 
has one drawback, namely, that the arrival of new productive oppor¬ 
tunities to the economy is affected by the job-taking behavior of un¬ 
employed workers. If an unemployed worker refuses a firm’s offer, 
then that firm is replaced, whereas acceptance by an unemployed 
worker generates no replacement. (The job-taking behavior of em¬ 
ployed workers does not have such an effect because with either deci¬ 
sion some firm is replaced.) This externality via changes in productive 
opportunities implies that the acceptance decisions of unemployed 
workers are not efficient. There are too many firms with low Y values, 
but no worker leaves a firm for one with a lower Y. 

III. An Efficient Bilateral Trading Market 

The model of the previous section is modified as follows. All firms, 
vacant and filled, go out of business at the expected rate b (which is 
the parameter of an exogenous Poisson process) and are replaced by 
new, vacant firms with newly drawn production values from the dis¬ 
tribution K(-) (no longer assumed uniform). A firm that is unsuccess¬ 
ful in hiring a worker or that becomes unattached because its worker 
retires or quits for a better job retains the same production value, 
remains in (returns to) the pool of vacant firms, and awaits a contact 
by another worker. An employed worker whose firm goes out of 
business experiences a permanent layoff, returns to the pool of un¬ 
employed workers, and continues to contact vacant firms. Again, all 
worker contacts are made at vacant firms at the exogenous expected
rate a. In a steady state there are four different agent positions: filled
(F) and vacant (V) for firms and employed (E) and unemployed (U)
for workers. 

The vacant firm’s problem is to choose a wage offer that maximizes 
expected profit, which, when the stationarity of the model is noted 
and the assumed zero interest rate is maintained, is given now by 11 

11 This is derived from the following equations as i → 0, noting that Π(Y, w) = V_V(Y, w)
(see n. 7):

$$iV_V(Y) = sG(w(Y))[V_F(Y) - V_V(Y)] - bV_V(Y),$$
$$iV_F(Y) = Y - w(Y) - \{r + a[1 - F(w(Y))]\}[V_F(Y) - V_V(Y)] - bV_F(Y).$$

With N and T finite, there are two complications. First, a firm that is turned down or
loses a worker would update slightly its prior about the actual distributions of reserva-





$$\Pi(Y, w) = \frac{sG(w)(Y - w)}{b\{r + b + sG(w) + a[1 - F(w)]\}}. \tag{7}$$


Again the firm must trade off the probability of hiring and retaining a 
worker against a reduced conditional surplus. The steady-state distri¬ 
bution of possible reservation wages, G(-), is now 


$$G(w) = \begin{cases} 0, & w < h \\ \dfrac{r + b}{r + b + a[1 - F(w)]}, & h \le w \le \bar{w} \\ 1, & w > \bar{w}, \end{cases} \tag{8}$$


where $\bar{w}$ is the maximal offered wage; $F(\bar{w}) = 1$. The expected proportion
of unemployed workers, equal to G(h) given F(h) = 0, is (r
+ b)/(r + b + a). The expected number of employed workers, {1 −
[(r + b)/(r + b + a)]}T, must equal the expected number of filled
firms. Hence the expected number of vacant firms is N − [aT/(r + b
+ a)], which is assumed positive. The aggregate expected contact rate
of all workers is aT. Hence the arrival rate of workers at a firm is (see
n. 6)

$$s = \frac{aT}{N - [aT/(r + b + a)]} = \frac{a(r + b + a)}{\delta(r + b + a) - a},$$

where δ ≡ N/T is the ratio of firms to workers. Note that δ = 1 implies
s = a/G(h).
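The steady-state accounting that produces s can be written out directly. In this sketch the function name and the numerical values are hypothetical; the formulas are the ones stated in the text: the unemployed share G(h), the number of filled and vacant firms, and the arrival rate of workers per vacant firm.

```python
def steady_state_counts(r, b, a, N, T):
    """Return (G(h), number of vacant firms, arrival rate s)."""
    unemployed_share = (r + b) / (r + b + a)      # G(h)
    employed = (1.0 - unemployed_share) * T       # equals the number of filled firms
    vacant = N - employed
    if vacant <= 0:
        raise ValueError("the model assumes a positive number of vacant firms")
    s = a * T / vacant                            # aggregate contacts per vacant firm
    return unemployed_share, vacant, s


# Hypothetical parameter values, for illustration only.
r, b, a = 0.05, 0.05, 1.0
N, T = 1500.0, 1000.0
delta = N / T

G_h, vacant, s = steady_state_counts(r, b, a, N, T)
s_closed_form = a * (r + b + a) / (delta * (r + b + a) - a)

# With as many firms as workers (delta = 1), s collapses to a / G(h).
G_h_equal, _, s_equal = steady_state_counts(r, b, a, T, T)
print(s, s_closed_form, s_equal, a / G_h_equal)
```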

The first-order condition for the vacant firm's optimal wage offer,
w*, can be written as

$$[Y - w^* - b\Pi(Y, w^*)]G'(w^*) + \Pi(Y, w^*)\frac{ab}{s}F'(w^*) = G(w^*). \tag{9}$$


This equates the gain via a greater probability of a hire and a lower 
probability of a quit achieved from an increase in the wage offer to 
the increased cost of a higher inframarginal wage payment. 

Definition. A noncooperative equilibrium is a cdf, F*(·), that induces
each vacant firm to choose an expected profit-maximizing wage
offer, given F*(·), such that the distribution of all possible wage offers
is F*(·).

Proposition 1. The distribution F*(·) is continuous and strictly
increasing on (h, $\bar{w}$).

Proof. If F*'(·) = 0 on some interval [c, d), with $h < c < d \le \bar{w}$, then


tion wages and competing offers. Second, given this updating, a worker would have a
slight incentive to behave strategically by refusing some current offers in order to
increase expected future offers. It is assumed that N and T are large and that r and b
are not too small relative to a; hence these complications are ignored.

any firm offering a wage of d could lower its offer to c, thus increasing
the conditional surplus without affecting the probability of either
hiring or retaining a worker. If there is a discrete increase in F*(·) at e,
with $h < e < \bar{w}$, then any firm offering a wage of e could increase its
expected profit by raising slightly its wage offer. Regardless of how a 
worker decides between equal offers, a slight increase in the wage 
from e would result in either a discrete decrease in the probability of a 
quit or a discrete increase in the probability of a hire, or both, while 
decreasing the conditional surplus by only an infinitesimal amount. 
Q.E.D. 

Proposition 1 does not guarantee the differentiability of F*(·),
which is necessary for (9) to be the correct condition defining the 
firm’s optimal wage offer. 

Proposition 2. If K(·) is continuously differentiable and strictly
increasing on ($\underline{Y}$, $\overline{Y}$), then F*(·) is differentiable on (h, $\bar{w}$).

Proof. Suppose, by way of contradiction, that w is a point of nondifferentiability
of F*(·). Denote the left-hand derivative, $F^{*\prime}(w)^-$, as L
and $F^{*\prime}(w)^+$ as R. If L > R, then w solves (9) for a range of Y values
and would be offered by every firm with a production value in this
range. Thus F*(·) must be discontinuous at w, a contradiction. If R
> L, then denote Y_1 as the Y value that solves (9) using $F^{*\prime}(w)^-$ for
F'(w), and denote Y_2 as the solution to (9) using $F^{*\prime}(w)^+$. This implies
Y_2 > Y_1. Denote $w^+$ as the solution to (9) with Y_1 and $F^{*\prime}(\cdot)^+$, and
denote $w^-$ as the solution to (9) with Y_2 and $F^{*\prime}(\cdot)^-$. No firm will offer
any wage in the interval ($w^-$, $w^+$), where $w^- < w < w^+$, a contradiction.
Q.E.D.

Proposition 2 implies that the optimal wage offer for each possible
Y is either at a corner or in the interior of [h, $\bar{w}$] with ∂Π/∂w = 0 and
∂²Π/∂w² < 0. This argument, combined with proposition 1, yields the
following result: If an equilibrium exists, then it must be characterized
by a wage-offer function that is composed of interior optima for
all potential values of Y except, possibly, $\underline{Y}$ and $\overline{Y}$.

Now to establish that such an equilibrium does exist. By substitution 
for G(w), (9) can be rearranged to give 


$$(Y - w^*)F'(w^*) = \frac{\{r + b + a[1 - F(w^*)]\}^2 + s(r + b)}{2a\{r + b + a[1 - F(w^*)]\}}. \tag{10}$$

In equilibrium, F(w*) is the steady-state distribution of possible wage 
offers made by the vacant firms. The Appendix shows that, given 
K(Y), equilibrium F(u>*) must satisfy 



$$F(w^*)\left\{1 - \frac{a^2[1 - F(w^*)]}{\delta(r + b + a)\{r + b + a[1 - F(w^*)]\}}\right\} = K(Y), \tag{11}$$

where w* = w*(Y) is the firm’s optimal wage offer. Note that F(w*)
and K(Y) are no longer related by a simple composition of the inverse 
wage-offer function as in Section II because not all vacant firms are 
new entrants. Given K(Y), equilibrium F(w*) defines an optimal w(Y), 
according to (10), that induces this F(w*), according to (11), as the 
distribution of potential wage offers. Inverting K(Y) in (11) gives 


$$Y = K^{-1}\!\left(F(w^*) - \frac{a^2 F(w^*)[1 - F(w^*)]}{\delta(r + b + a)\{r + b + a[1 - F(w^*)]\}}\right). \tag{12}$$


Substituting into (10) and rearranging yields the differential equation 


$$F'(w^*) = \frac{\{r + b + a[1 - F(w^*)]\}^2 + s(r + b)}{2a\{r + b + a[1 - F(w^*)]\}\{K^{-1}[\cdot](w^*) - w^*\}}. \tag{13}$$
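To see what a solution of (13) looks like, the equation can be integrated forward from the initial condition F(h) = 0. The sketch below is a crude forward-Euler integration under stated assumptions: K(·) is taken to be uniform (Section III does not require this), all parameter values are hypothetical, and the step size is arbitrary. It is not the existence argument of the Appendix. Along the path it records the production value Y associated with each wage through (12).

```python
def solve_offer_distribution(r, b, a, delta, h, Y_lo, Y_hi, dw=1e-4):
    """Forward-Euler path of (w, F(w), Y(w)) for equation (13), uniform K assumed."""
    s = a * (r + b + a) / (delta * (r + b + a) - a)   # arrival rate at a vacant firm
    F, w = 0.0, h
    path = []
    while F < 1.0:
        E = r + b + a * (1.0 - F)
        # Equation (11): value of K(Y) implied by the current F.
        k = F * (1.0 - a * a * (1.0 - F) / (delta * (r + b + a) * E))
        Y = Y_lo + k * (Y_hi - Y_lo)                  # inverse of the uniform K, as in (12)
        path.append((w, F, Y))
        dF = (E * E + s * (r + b)) / (2.0 * a * E * (Y - w))   # right-hand side of (13)
        F += dF * dw
        w += dw
    return path


# Hypothetical parameter values, for illustration only.
h, Y_lo, Y_hi = 0.5, 1.0, 2.0
path = solve_offer_distribution(r=0.05, b=0.05, a=1.0, delta=1.5,
                                h=h, Y_lo=Y_lo, Y_hi=Y_hi)
w_top = path[-1][0]
print(len(path), w_top, path[-1][2])
```

In this example the production value rises strictly with the wage along the whole path, so the induced wage-offer function is strictly increasing, in line with Proposition 4 below.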


Proposition 3. If K(Y) is a continuously differentiable and strictly
increasing cdf on $[\underline{Y}, \overline{Y}]$, then a unique, noncooperative equilibrium,
F*(·), exists.

Proof. See the Appendix. 

Proposition 4. The induced equilibrium wage-offer function, 
w*(Y), is strictly increasing; hence, the allocation of labor is efficient. 

Proof. See the Appendix. 

The extension of the efficiency result to the case of a discrete pro¬ 
ductivity distribution is as follows. Assume that all firms have a com¬ 
mon production value Y. The equilibrium (in which all firms receive 
the same expected profit) wage-offer distribution is obtained by solv¬ 
ing the differential equation for F(-) that is the first-order condition 
for each firm. Hence, replace Y with Y in (10) and solve for F(-) using 
the initial condition F(h) = 0. This is analogous to the equilibrium 
derived and shown to be unique in Mortensen (1985). 

Now assume that K(Y) is a two-point distribution {Y_1, Y_2} with probabilities
p and 1 − p, respectively. Let F_i(·) denote the equilibrium
wage-offer distribution in the common product value case when the
common productivity is Y_i. Using the initial conditions F_1(h) = 0 and
$F_2(\bar{w}_1) = p$, where $F_1(\bar{w}_1) = p$, yields the equilibrium

$$F^*(w) = \begin{cases} F_1(w) & \text{for } h \le w \le \bar{w}_1 \\ F_2(w) & \text{for } \bar{w}_1 < w \le \bar{w}_2. \end{cases}$$

This argument can be extended to any discrete productivity distribu¬ 
tion (note that points of nondifferentiability of F*(-) are now possible 
because K(·) is discrete). Efficiency obtains because firms with strictly
greater production values offer strictly greater wages.

IV. Robustness


The robustness of the efficiency property with respect to certain re¬ 
strictive assumptions must be explored to put the results in perspec¬ 
tive. Three types of generalizations are considered. Heterogeneities 
are introduced among both workers and firms with the result that 
certain heterogeneities weaken the efficiency result but others have 
no effect. The discussion below attempts to clarify just what homoge¬ 
neities are required for efficiency. This investigation of homogeneity 
is then extended by consideration of a more sophisticated bargaining 
game in which the ability to commit is reduced. Finally, allowing 
workers to choose how hard to search may preclude efficiency, 
though it is possible that efficiency continues to obtain even with 
endogenous search intensities. 

A key feature of the model is that leisure value is the same for each 
worker. If workers differ in their values of leisure so that G(·) has no
mass point and firms cannot observe these differences, then overall 
efficiency will not obtain. Some firms will find it in their interest to 
offer w < h (where h is the highest leisure value) even though Y > h. 

Thus too many workers refuse initial employment, though interfirm 
turnover remains efficient (if labor supply is elastic, then, for the same
reason, there will be underemployment of labor). In this case prices
(w and h) do not provide correct signals for comparing alternatives
because the information structures of the two bargains differ. The
worker's “trade” with nature for leisure is effectively a complete information
bargain, whereas the trade with the firm is a bargain under
incomplete information. 12 When there is complete information, the
tension between making a trade and capturing surplus is more easily 
diffused by making a separate decision on each. 

The implications of differing leisure values reinforce the analogy 
between this model and an auction. If the seller announces a minimum
bid, then the object will be sold if any bidder has a valuation
above this level. If the seller has a private reservation value drawn 
from some nondegenerate distribution, then it is possible that no 
bidder will win the object, but if anyone bids high enough to win, it 
will be the bidder with the highest valuation. 

If firms draw their production values from differing distributions,
efficiency will obtain. In contrast, heterogeneous distributions in a 
simple auction setting imply that bidders, conditional on their valua¬ 
tions, are solving different problems when they choose a bid. In such 

12 This is the feature that results in inefficiency in the model of Perry and Solon
(1985). In their model the worker's alternative wage is assumed not to arise from
bargaining and is analogous to (differing) leisure values in this paper.

a setting each bidder faces a different set of N − 1 opponents; hence
there is no longer a common bid function and inefficiency is possible. 

What about for an "auction over time”? Each vacant firm is bidding 
against distributions of expected existing wages and expected future 
wage offers, not a set of contemporaneous bidders. In a steady state 
these distributions will be the same for every vacant firm. Hence, 
given its product value, each vacant firm solves an identical problem 
because the (prior) distributions of possible reservation wages and 
potential competing offers remain constant in a steady state. 13 Effi¬ 
ciency again obtains because the same, strictly monotonic, wage-offer 
function dictates the behavior of every vacant firm. This results in 
workers’ always choosing to produce where output is higher. In¬ 
troducing heterogeneities, then, does not automatically destroy the 
efficiency properties of the model, though it does emphasize that it is
the homogeneity of the bargaining environment, rather than of the 
bargainers themselves, that is necessary for efficiency. 

The importance of a homogeneous environment is made clear by 
considering bargaining games in which the ability to commit is re¬ 
duced. In an infinite-horizon bargaining game with no commitments, 
Cramton (1984) shows that trade occurs if and only if b > s. In the 
model of Section III, this mechanism would imply that, when a work¬ 
er’s current wage is taken as exogenous (the s value), there would be too 
much turnover. Some agreement between the worker and the incum¬ 
bent firm (which must occur before production can begin) induces a 
potential heterogeneity between the worker’s bargain with the new 
firm and the bargain with the incumbent firm. In the extreme case 
just mentioned, the agreed-on wage is exogenous and functions as a 
take-it-or-leave-it offer. Thus the crucial feature for efficiency is that
the worker's relationship to each employer must be homogeneous: if 
there is active negotiation with the new firm, then there must be active 
negotiation with the incumbent firm. As a conjecture, it would seem 
that if the bargaining mechanism is sufficiently homogeneous across 
all bargains between firms and employed workers, then the efficiency 
of the turnover decision will be preserved provided that the wage 
outcome is strictly increasing in Y.

Much of the search and matching literature (see, e.g., Mortensen 
1978, 1982a, 1982b; Pissarides 1984a, 1984b) is concerned primarily
with endogenous search intensities. The analysis here would be in¬ 
complete without a word on these choices and their implications for 

** With heterogeneous product value distributions the assumption that firms have 
constant Y values is important. If firms draw another Y when an offer is refused, then 
each firm will not face the same alternative to an unsuccessful offer because expected 
new draws will differ, and thus all firms cannot be characterized by the same wage-offer 
function. 
efficiency. Suppose that workers choose a, the expected rate at which 
they make contact with vacant firms, subject to some cost. This implies 
that a = a(w) with a'(w) < 0 for each worker in equilibrium. Hence 
firms have an additional consideration when making a wage offer: a 
relatively well-paid worker will search less intensively for another job. 
The wage-offer function would remain strictly monotonic, and work¬ 
ers would again act “as if” they were comparing the Y values of firms 
when deciding whether to quit or not. The efficient choice of inten¬ 
sities seems unlikely, however, because this requires that workers also 
choose a as if they were comparing these Y values. This would occur 
if, and only if, w'(Y) = 1 because the worker considers only the 
expected (truncated) value of the wage from more intensive search. 
There is no reason to suspect that such a wage-offer function would 
necessarily emerge in equilibrium. It is worth noting, however, that 
the required wage-offer function (output less a constant) is exactly 
what would emerge as the optimal principal-agent contract in the case 
of a risk-neutral agent. With all production opportunities housed in 
one large firm, workers would be paid in this manner (if turnover 
must be accomplished via a decentralized mechanism). An interesting 
question, then, is under what conditions (if any) such a payment 
scheme would emerge in a noncooperative environment. 

V. Conclusion 

This paper has developed a steady-state equilibrium model of bilat¬ 
eral trade in order to analyze how resources are allocated among 
competing uses when information about trade gains is incomplete. 
While it is quite plausible that many ex post Pareto-optimal transac¬ 
tions do not occur when a single two-party negotiation is being consid¬ 
ered, a “market" characterized by successive bilateral trades can be 
efficient. The alternatives to a given trade must be considered in 
order to assess efficiency. When these alternatives arise from trades 
with a similar bargaining environment so that similar bargaining ten¬ 
sions exist, then the decisions based on private incentives will correctly 
order alternatives from a social viewpoint as well. As an example, this 
paper shows that the "excessive” quits or layoffs in Hall and Lazear 
(1984) are not inefficient if the single worker-firm bargain they con¬ 
sider also characterizes each agent’s market alternatives. 

The model is meant as a starting point for the analysis of markets 
characterized by transactions between pairs of agents when informa¬ 
tion is incomplete. Even though there is no auctioneer, the insights 
from auction theory may bear useful fruit in such analyses. A key 
point is that market participants face a distribution of opportunities 
instead of a single price. If market participants face the same distribu- 



11 \ JOURNAL OF POLITICAL ECONOMY 

tion and are sufficiently homogeneous, then all available trade gains 
are captured. The investigation of how endogenous search intensities 
will alter this conclusion is an important next step. 


Appendix 


A. The Steady-State Distribution of Possible Wage Offers 

All wage offers are made by vacant firms. Hence, the distribution of potential 
wage offers depends on the distribution of production values, Y, across vacant 
firms. Given that $\partial^2 \Pi/\partial w^2 < 0$, $dw/dY > 0$ so that each vacant firm can be 

identified by its wage offer. The expected flow rate of firms offering a wage 
less than or equal to w into the pool of vacant firms is 

bK(Y) + T [G(») - G(A)] + | [G(w) - G(A)J[1 - #»]. (Al) 

This inflow is composed of new entrants, firms whose workers retire, and 
firms whose workers quit. (Note that the number of filled firms equals the 
number of employed workers and that 8 is the ratio of firms to workers.) The 
expected flow rate of firms offering a wage less than or equal to w out of the 
pool of vacant firms is 

bF(w)fi + sF(u')$a, (A2) 

where 3=1- [(«/8)/(r + b + s)j is the expected proportion of vacant firms 
and ot = (r + b)/(r + b + a) is the expected proportion of unemployed 
workers (note that 3 = a/5.t). This outflow is composed of firms that go out of 
business and firms that hire an unemployed worker. Note that a hire of an 
employed worker does not generate a net inflow to F(w) because this worker’s 
former employer is now vacant and offers a wage less than w. In a steady state 
these two flows must balance for every w. Equating (Al) and (A2), substitut¬ 
ing for 3 and for G(-) from (8), and rearranging yields (11). 

B. Proof of Proposition 3 

The continuity of /[id, F(w)] and Jr[w. F{w)} is guaranteed by noting that 
K " l [Q(u/)], which exists by assumption, is the production value, Y. that gener¬ 
ates the wage offer w and that (dli/dw)| v . T < () implies that each vacant firm’s 
wage offer is strictly less than its production value. With the initial condition 
F(h) = 0 (by means of propositions 1 and 2; see also n. 10), the Cauchy 
theorem guarantees that a unique solution exists in a neighborhood of (A, 
F(k)). Extension of the theorem for h s w s w is possible because of the 
Lipschitz condition, \9JI8F\ < M, and by noting that F{ui) = 1 for SB < F. 
Hence, this F*(w) qualifies as a cdf on [A, 5?] with F*'(w ) > 0, and along with 
K'(Y) > 0, this satisfies the second-order condition for the maximization of 
$\Pi(Y, w)$ in (7) so that (10) is both necessary and sufficient. Q.E.D. 

C. Proof of Proposition 4 

Given that in equilibrium all wage offers are interior optima, the condition 
$\partial^2 \Pi/\partial w^2 < 0$ implies immediately that $dw^*/dY > 0$. This can also be derived by 
noting that, from (11), $w^{*\prime}(Y)$ can be written as 


w*'(Y) 


K'(Y) . I K(Y) + _ a 2 (r + b)F*(w) _ 

\F*(w) 8(r + b + a){r + b + a [I - P*(u')]} !! 



which is strictly positive on $(\underline{Y}, \bar{Y})$ given $F^{*\prime}(w) > 0$. 
The capital asset pricing model provides a theoretical structure for 
the pricing of assets with uncertain returns. The premium to induce 
risk-averse investors to bear risk is proportional to the nondiversifi- 
able risk, which is measured by the covariance of the asset return 
with the market portfolio return. In this paper a multivariate gener¬ 
alized autoregressive conditional heteroscedastic process is estimated 
for returns to bills, bonds, and stocks where the expected return is 
proportional to the conditional covariance of each return with that of 
a fully diversified or market portfolio. It is found that the condi¬ 
tional covariances are quite variable over time and are a significant 
determinant of the time-varying risk premia. The implied betas are 
also time-varying and forecastable. However, there is evidence that 
other variables including innovations in consumption should also be 
considered in the investor's information set when estimating the 
conditional distribution of returns. 
CAPITAL ASSET PRICING MODEL 


I. Introduction 

The capital asset pricing model (CAPM), originally proposed by 
Sharpe (1964) and Lintner (1965) following the suggestions of mean 
variance optimization in Markowitz (1952), has provided a simple and 
compelling theory of asset market pricing for more than 20 years. In 
its simplest form the theory predicts that the expected return on an 
asset above the risk-free rate is proportional to the nondiversifiable 
risk, which is measured by the covariance of the asset return with a 
portfolio composed of all the available assets in the market. The as¬ 
sumptions implicit in the version of the model discussed here are that 
(1) all investors choose mean-variance efficient portfolios with a one- 
period horizon, although they need not have identical utility func¬ 
tions; (2) all investors have the same subjective expectations on the 
means, variances, and covariances of returns; and (3) the market is 
fully efficient in that there are no transaction costs, indivisibilities, 
taxes, or constraints on borrowing or lending at a risk-free rate. 

Empirical tests of the CAPM have tended to focus on assumption 1 
while strengthening 2 to include the assumption that the common 
distributions are constant over time and that the entire market is the 
market for equities. These tests generally have found that the risk 
premium on individual assets can be explained by variables other 
than the estimated covariance. In particular, the own variance, firm 
size, and the month of January seem to be variables that help to 
explain expected returns. See, for example, Jensen (1972) for a sur¬ 
vey of many of these early studies and Ross (1978), Roll and Ross 
(1980), Chen (1983), and Schwert (1983) for more recent surveys. 

One interpretation for the failure of the CAPM to fully explain 
observed risk premia, due to Roll (1977), is that any empirical 
covariance is computed from an incomplete market for assets. Such 
an objection nearly makes the CAPM untestable. Another explanation 
is, of course, that alternative theories of asset pricing may be 
supportable such as the arbitrage pricing theory of Ross (1976) or the 
consumption beta formulation introduced by Breeden (1979). 

In this paper we focus attention on the possibility that agents may 
have common expectations on the moments of future returns but that 
these are conditional expectations and therefore random variables 
rather than constants. For a discussion along these lines, see also 
Ferson (1985), Rothschild (1985), and Ferson, Kandel, and Stambaugh 
(1986). 

Let $y_t$ be the vector of (real) excess returns of all assets in the market 
measured as the nominal return during period $t$, minus the nominal 
return on a risk-free asset, and let $\mu_t$ and $H_t$ be the conditional mean 
vector and conditional covariance matrix of these returns given information 
available at time $t - 1$. Also let $\omega_{t-1}$ be the vector of value 
weights at the end of the previous period so that the excess return on 
the market is defined as $y_{Mt} = y_t'\omega_{t-1}$. Then the vector of covariances 
with the market is simply $H_t\omega_{t-1}$ and the CAPM requires 

$$\mu_t = \delta H_t \omega_{t-1}. \tag{1}$$

In this formulation, as derived by Jensen (1972), $\delta$ is a scalar constant 
of proportionality, which in equilibrium is an aggregate measure of 
relative risk aversion given by the harmonic mean of the agents' degree 
of relative risk aversion weighted by the agents' share of aggregate 
wealth (cf. Bodie, Kane, and McDonald 1983, 1984). Throughout 
the paper we assume $\delta$ to be constant. 

The conditional variance of the market excess return is $\sigma^2_{Mt} 
= \omega_{t-1}' H_t \omega_{t-1}$ and the conditional mean is $\mu_{Mt} = \omega_{t-1}'\mu_t$, which from 
(1) can be written as 

$$\mu_{Mt} = \delta \sigma^2_{Mt}, \tag{2}$$

so that $\delta$ is seen to be the slope of the market trade-off between mean 
and variance. Defining the beta of an asset to be the covariance of that 
asset with the market divided by the variance of the market portfolio, 
$\beta_t = H_t\omega_{t-1}/\sigma^2_{Mt}$, and substituting in (1) and (2) yields the familiar 
expression 

$$\mu_t = \beta_t \mu_{Mt}. \tag{3}$$

Because the covariance matrix of returns varies over time, the mean 
returns and the betas will in general also be time-varying. 
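
Relations (1) through (3) can be checked numerically. The following is a minimal sketch under stated assumptions: the covariance matrix, the value weights, and the value of the proportionality constant are invented for illustration and are not estimates from this paper, and the helper names `matvec` and `dot` are hypothetical.

```python
# Illustrative check of equations (1)-(3). All numbers are made up.
delta = 0.5  # assumed constant of proportionality

# Hypothetical conditional covariance matrix H_t (bills, bonds, stocks).
H = [[0.10, 0.20, 0.05],
     [0.20, 4.00, 1.00],
     [0.05, 1.00, 9.00]]
# Hypothetical value weights omega_{t-1}; they sum to one.
w = [0.2, 0.3, 0.5]


def matvec(a, x):
    """Matrix-vector product for lists of lists."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


cov_with_market = matvec(H, w)                 # H_t omega_{t-1}
mu = [delta * c for c in cov_with_market]      # equation (1)
var_market = dot(w, cov_with_market)           # sigma^2_{Mt}
mu_market = dot(w, mu)                         # omega_{t-1}' mu_t
beta = [c / var_market for c in cov_with_market]

# Equation (2): the market premium equals delta times the market variance.
gap_eq2 = abs(mu_market - delta * var_market)
# Equation (3): each premium equals its beta times the market premium.
gap_eq3 = max(abs(m - b * mu_market) for m, b in zip(mu, beta))
```

Both gaps are zero up to rounding error, and the value-weighted average of the betas equals one, as it must when the market is the weighted sum of the assets.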

We have stated the CAPM in terms of conditional moments since 
these reflect the information set available to agents at the time the 
portfolio decisions are made. But this model also implies a relation 
between unconditional moments. In the special case in which the 
value weights are fixed, the unconditional means are constant and are 
given by 

$$E(y_t) = \delta V(y_t)\omega - \delta^3 V(H_t\omega)\omega.$$

Only if $V(H_t\omega) = 0$ will the unconditional moments satisfy the same 
CAPM relations as the conditional moments. By a similar argument, 
if the econometrician uses only a subset of the relevant conditioning 
information, then the estimated conditional moments will not satisfy 
the CAPM. 

In this paper the conditional covariance matrix of a set of asset 
returns is allowed to vary over time following the generalized auto¬ 
regressive conditional heteroscedastic (GARCH) process (see Engle 
1982; Bollerslev 1986). This essentially assumes that agents update 
their estimates of the means and covariances of returns each period, 
using the newly revealed surprises in last period’s asset returns. Thus 
agents learn about changes in the covariance matrix only from infor¬ 
mation on returns. There may, of course, be additional information 
relevant to agents' expectations that would lead to misspecification as 
mentioned above. 

The approach is a multivariate generalization of Engle, Lilien, and 
Robins (1987), which treated a single asset, and therefore estimates 
the time-varying risk premium as a function of the conditional vari¬ 
ance of that asset return alone. A similar idea was employed in the 
recent papers by French, Schwert, and Stambaugh (1986) and 
Poterba and Summers (1986). The approach can also be seen as a 
statistical implementation of the intertemporal CAPM of Bodie et al. 
(1983, 1984), in which they had no unknown parameters and no 
statistical test of the model performance. Finally, the paper can be 
viewed as a generalization of Frankel (1985), who assumed that $\omega_t$ 
may be time-varying but that $H_t$ is not, and of Friedman (1985a, 
1985b), who allowed $H_t$ to be time-varying only because investors 
must learn about the unconditional variance $V(y_t)$. 


II. Econometric Methods 

According to the economic model (1), any explanation of time- 
varying expected excess holding yields should be built around a struc¬ 
ture with a time-varying conditional covariance matrix. As mentioned 
above, a model ideally suited for this purpose is the multivariate 
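GARCH in mean (GARCH-M) model. For $y_t$ $N \times 1$, the GARCH 
$(p, q)$-M model takes the general form 

$$
\begin{aligned}
y_t &= b + \delta H_t \omega_{t-1} + \varepsilon_t, \\
\operatorname{vech}(H_t) &= C + \sum_{i=1}^{q} A_i \operatorname{vech}(\varepsilon_{t-i}\varepsilon_{t-i}') + \sum_{j=1}^{p} B_j \operatorname{vech}(H_{t-j}), \\
\varepsilon_t \mid \psi_{t-1} &\sim N(0, H_t),
\end{aligned} \tag{4}
$$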

where vech($\cdot$) denotes the column stacking operator of the lower portion 
of a symmetric matrix, $b$ is an $N \times 1$ vector of constants, $\varepsilon_t$ is an 
$N \times 1$ innovation vector, $C$ is a $\tfrac{1}{2}N(N + 1) \times 1$ vector, and $A_i$, $i = 1, 
\ldots, q$, and $B_j$, $j = 1, \ldots, p$, are $\tfrac{1}{2}N(N + 1) \times \tfrac{1}{2}N(N + 1)$ matrices. A 
nonzero $b$ vector might reflect a preferred habitat phenomenon or 
differential tax treatment of the assets. Of course the GARCH specification 
does not arise directly out of any economic theory, but as in the 
traditional autoregressive moving average time-series analogue, it 
provides a close and parsimonious approximation to the form of 
heteroscedasticity typically encountered with economic time-series 
data (cf. Bollerslev 1986; Engle and Bollerslev 1986). 
The conditional log likelihood function for (4) for the single time 
period $t$ can be expressed as 

$$L_t(\theta) = -\frac{N}{2}\log 2\pi - \frac{1}{2}\log|H_t(\theta)| - \frac{1}{2}\varepsilon_t(\theta)' H_t^{-1}(\theta)\varepsilon_t(\theta), \tag{5}$$

where all the parameters have been combined into $\theta' = (b', \delta, C', 
\operatorname{vec}(A_1)', \ldots, \operatorname{vec}(A_q)', \operatorname{vec}(B_1)', \ldots, \operatorname{vec}(B_p)')$, an $m \times 1$ vector. Thus, 
conditional on the initial values, the log likelihood function for the 
sample $1, \ldots, T$ is given by$^{1}$ 

$$L(\theta) = \sum_{t=1}^{T} L_t(\theta). \tag{6}$$

As is obvious from (4), (5), and (6), the log likelihood function $L(\theta)$ 
depends on the parameters $\theta$ in a highly nonlinear fashion, and the 
maximization of $L(\theta)$ requires iterative methods. The approach taken 
here is to use the Berndt et al. (1974) algorithm along with numerical 
first-order derivatives to approximate $\partial L_t(\theta)/\partial\theta$. These derivatives 
provide an added flexibility to changes in the specification. 
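
To make the likelihood in (5) and (6) concrete, the following is a minimal sketch of how the per-period Gaussian log likelihood could be evaluated once the innovations and conditional covariance matrices are in hand. It is not the estimation code used for this paper: the function names are hypothetical, the Cholesky route is simply one convenient way to obtain the log determinant and the quadratic form, and the iterative maximization over the parameters is not shown.

```python
import math


def cholesky(h):
    """Lower-triangular factor of a symmetric positive definite matrix."""
    n = len(h)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(h[i][i] - s)
            else:
                low[i][j] = (h[i][j] - s) / low[j][j]
    return low


def loglik_t(eps, h):
    """Equation (5): Gaussian log likelihood for one period."""
    n = len(eps)
    low = cholesky(h)
    # Forward substitution solves low * z = eps, so z'z = eps' H^{-1} eps.
    z = []
    for i in range(n):
        s = sum(low[i][k] * z[k] for k in range(i))
        z.append((eps[i] - s) / low[i][i])
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    quad = sum(v * v for v in z)
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * logdet - 0.5 * quad


def loglik(eps_path, h_path):
    """Equation (6): sum of the per-period terms over the sample."""
    return sum(loglik_t(e, h) for e, h in zip(eps_path, h_path))


# Small made-up example with two assets and two periods.
example = loglik([[0.0, 0.0], [1.0, 1.0]],
                 [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 1.0], [1.0, 2.0]]])
```

A numerical optimizer would call `loglik` repeatedly, rebuilding the innovations and covariance matrices from the candidate parameter vector at each step.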

Given standard regularity conditions (see Crowder 1976; Wooldridge 
1986), it follows that the maximum likelihood (ML) estimate 
for $\theta$ will be asymptotically normal and unbiased with covariance 
matrix equal to the inverse of Fisher’s information matrix. Therefore, 
traditional inference procedures are immediately available. In particular, 
when equation (4) is tested versus a more general specification, 
the Lagrange multiplier (LM) test statistic takes the well-known form 
LM = $T \cdot R^2$, where $R^2$ is the uncentered coefficient of multiple 
correlation in the first Berndt et al. iteration for the augmented 
model starting at the ML estimates under the null (cf. Engle 1984). 

As it stands, (4) is very general and involves a total of $(N + 1) 
+ \tfrac{1}{2}N(N + 1) + \tfrac{1}{4}N^2(N + 1)^2(p + q)$ parameters. A natural 
simplification is to assume that each covariance depends only on its 
own past values and surprises. Throughout this paper we shall therefore 
take $p = q = 1$ and impose diagonality on the matrices $A_1$ and 
$B_1$. With these simplifications, the GARCH(1, 1)-M model considered 
here becomes 

$$
\begin{aligned}
y_{it} &= b_i + \delta \sum_{j} h_{ijt}\,\omega_{j,t-1} + \varepsilon_{it}, \\
h_{ijt} &= \gamma_{ij} + \alpha_{ij}\,\varepsilon_{i,t-1}\varepsilon_{j,t-1} + \beta_{ij}\,h_{ij,t-1}, \qquad i, j = 1, \ldots, N, \\
\varepsilon_t \mid \psi_{t-1} &\sim N(0, H_t),
\end{aligned} \tag{7}
$$
1 In practice the presample values are set equal to their expected value, zero. 
where subscript i refers to the ith element of the corresponding vec¬ 
tor and ij to the ijth element of the corresponding matrix. Thus only 
the own lagged moments and cross products appear in each of the 
conditional covariance equations. 

Model (7) extends the univariate ARCH model introduced in Engle 
(1982) in several directions by allowing for multiple time series, condi¬ 
tional covariance terms in the mean, and own past conditional covari¬ 
ances in each of the covariance equations. 
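
One step of the diagonal GARCH(1, 1)-M recursion in (7) can be sketched as follows. This is an illustration under stated assumptions rather than the code behind the estimates: the function name and argument layout are hypothetical, the parameter values in the example are invented, and the presample values are passed in explicitly by the caller.

```python
def garch_m_step(y, w_prev, eps_prev, h_prev, b, delta, gamma, alpha, beta):
    """Advance model (7) by one period.

    y        : observed excess returns at time t (length N)
    w_prev   : value weights omega_{t-1} (length N)
    eps_prev : innovations at t-1 (length N)
    h_prev   : conditional covariance matrix at t-1 (N x N)
    b, delta, gamma, alpha, beta : parameters of (7)
    Returns (h, premium, eps) for time t.
    """
    n = len(y)
    # Each covariance depends only on its own lagged cross product and
    # its own lagged value (the diagonality restriction).
    h = [[gamma[i][j]
          + alpha[i][j] * eps_prev[i] * eps_prev[j]
          + beta[i][j] * h_prev[i][j]
          for j in range(n)] for i in range(n)]
    # Risk premium: intercept plus delta times covariance with the market.
    premium = [b[i] + delta * sum(h[i][j] * w_prev[j] for j in range(n))
               for i in range(n)]
    eps = [y[i] - premium[i] for i in range(n)]
    return h, premium, eps


# Made-up two-asset example.
h, premium, eps = garch_m_step(
    y=[1.0, 1.0],
    w_prev=[0.5, 0.5],
    eps_prev=[1.0, -1.0],
    h_prev=[[1.0, 0.0], [0.0, 1.0]],
    b=[0.0, 0.0],
    delta=0.5,
    gamma=[[1.0, 0.5], [0.5, 2.0]],
    alpha=[[0.1, 0.1], [0.1, 0.1]],
    beta=[[0.8, 0.8], [0.8, 0.8]],
)
```

Iterating this step over the sample yields the sequences of innovations and covariance matrices that enter the likelihood. Symmetric parameter matrices keep each covariance matrix symmetric.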

III. Data Description 

The market portfolio studied in the present paper is composed of 
bills (6-month Treasury bills), bonds (20-year Treasury bonds), and 
stocks. In broad terms, these three assets account for a good part, but 
certainly not all, of the liquid investment opportunities available. The 
data are quarterly percentage returns from the first quarter of 1959 
through the second quarter of 1984, for a total of 102 observations. 
The return on 3-month Treasury bills is taken to represent the risk-free 
return. For a detailed description of the data sources and data 
transformations, see the Appendix. 

Two data sets have been analyzed for these three returns series. In 
the previous draft of this paper the Standard and Poor’s 500 equity 
series was used with Citibase interest rates. In this version New York 
Stock Exchange value-weighted equity returns are used with Salomon 
Brothers bill and bond yields. The results are quite similar, so only the 
second data set will be discussed here. The original results are avail¬ 
able from the authors. 

The mean of the excess holding yield on 6-month bills over the 
sample is 0.142 percent at a quarterly rate while the standard deviation 
is 0.356. For bonds the average excess holding yield is -0.761 
percent with standard deviation 6.255, and for stocks the excess holding 
yield and the standard deviation are -0.995 percent and 2.225. 
All the excess holding yield series, however, tend to be somewhat 
erratic. The maximum return on a 3-month balanced portfolio obtained 
by borrowing at the 3-month rate and lending at the 6-month 
rate was 2.046 percent at a quarterly rate in the second quarter of 
1980. On the other hand, the three worst returns occurred in the first, 
third, and fourth quarters of 1980 with -0.462, -0.777, and -0.515 
percent. For bonds, the best return on a balanced portfolio was also in 
the second quarter of 1980 with 22.274 percent, whereas the two 
worst returns were in the previous and subsequent quarters with 
-18.461 and -14.422 percent, respectively. Stocks did best in the 
first quarter of 1975 with 3.746 percent, but two quarters before, the 
return was as poor as -8.642, at quarterly percentage rates. This suggests 
that not only do the conditional mean excess holding yields vary 
over time, but also the conditional variances seem to be changing 
through time. 


IV. Model Estimates 

In this section we present model estimates for a trivariate CAPM. The 
econometric specification of the model is as in (7). The ML estimates 
(with corresponding standard errors in parentheses) are 
where i = 1, 2, 3 refers to bills, bonds, and stocks, respectively. 

The estimates for the model are appealing. The estimated value for 
$\delta$ = .499 is reasonable and highly significant, lending some support 
for the theory presented here. 

The intercept terms vary substantially across the three assets. Al¬ 
though the theoretical model does not include constant terms in the 
risk premia, these effects are of some importance. The large negative 
intercepts for bonds and stocks are not surprising since reduced capi¬ 
tal gains taxes on long-term assets provide incentives to hold these 
assets even at otherwise unfavorable rates of return. It is also well 
known that bond and equity holders did consistently worse than other 
asset holders over the sample period. The intercepts -4.3 and -3.1 
reflect this fact. 
Fig. 1.—Risk premia for bills 


The dynamic structure in the second moments for bills and bonds is 
apparent as reflected by the significant variance and covariance parameters. 
Even though none of the six variance or covariance parameters 
for stocks is individually significant at the usual 5 percent level, it 
is interesting to note that the likelihood ratio test statistic for absence 
of dynamics in the second-order moments for stocks equals 18.639, 
which exceeds the .995 value of a $\chi^2_6$ distribution, thus soundly rejecting 
the null hypothesis. Any correctly specified intertemporal asset 
pricing model ought to take this observed heteroscedastic nature of 
asset returns into account. The same point has also been made in the 
recent papers by Ferson (1985) and Ferson et al. (1986). In particular, 
tests of the CAPM that treat the conditional covariance matrix as 
constant over time invariably falter. 
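
The size of this rejection can be checked by hand. The sketch below evaluates the upper tail of a chi-square distribution with an even number of degrees of freedom using its closed-form series. Treating the test as having six degrees of freedom, one for each stock variance or covariance parameter set to zero under the null, is an assumption made here for illustration, and the function name is hypothetical.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Valid only for even df, where the tail has the closed form
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Likelihood ratio statistic reported in the text, with the assumed
# six degrees of freedom.
p_value = chi2_sf_even(18.639, 6)
# p_value falls just below 0.005, in line with the rejection at the
# .995 level described above.
```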

The estimated risk premia from the model, $b_i + \delta \sum_j h_{ijt}\omega_{j,t-1}$, are 
plotted in figures 1-3 along with the excess holding yields for each of 
the three assets. Figures 1 and 2 show that the estimates for bills and 
bonds are fairly similar except for a difference in scale. Both assets 
have rising risk premia during the volatile post-October 1979 period. 
It is reasonable to believe that, on average, investors were paid a 
positive premium for holding bills or bonds during this period. Note 
that the negative premia observed for bonds and equities in some 
periods could be due to the preferential tax treatment as previously 
mentioned. 

Plotted in figures 4-6 are the estimated betas. Not surprisingly, the 
beta for stocks is close to one, that for bonds is slightly above one, and 
that for bills is close to zero. There is, however, substantial movement 
over the sample period. 


V. Diagnostic Tests 

Given the evidence above, the trivariate CAPM seems to fit the data 
reasonably well. However, in order to assess the general validity of the 
model, a series of LM tests were performed. We shall consider here 
only a very small subset of these. 

The first LM test involves the inclusion of the own conditional 
variances in each of the three equations for the conditional expecta¬ 
tion of the excess holding yields. The test statistic equals 1.148, which 
is asymptotically a $\chi^2_3$ random variable if the null hypothesis is true. 
Thus the null cannot be rejected at any level under 23 percent, lend¬ 
ing further support to the model. This test is of particular interest 
since in tests of the time-invariant CAPM the own variance is often 
found to be highly significant. This might also provide an explanation 
for the empirical findings in French et al. (1986) and Poterba and 
Summers (1986), where a time-varying measure of the own condi¬ 
tional variance or volatility is found to have little explanatory power 
for the expected return on the stock market. Our results suggest that 
a better measure might be the nondiversifiable risk as given by the 
conditional covariance with the market. 

The next test considers the lagged excess holding yields as explana¬ 
tory variables for each of the three risk premia. This test rejects the 
formulation of the CAPM given in (8). The value of the test statistic is 
18.311 and is highly significant at any reasonable level in the corresponding 
$\chi^2$ distribution. Thus agents may use information in addition 
to past innovations in forming their expectations. This ability of 
the lagged dependent variable to help forecast returns is not all that 
surprising in view of other recent results in the literature (see, e.g., 
Campbell 1987). 

One of the competing theories of the intertemporal CAPM pre¬ 
sented here is the consumption beta formulation mentioned in the 
Introduction. It is therefore interesting to note that the test for inclusion 
of innovations in the logarithm of per capita consumption in the 
conditional mean equals 14.027, the value of a $\chi^2_3$ random variable in 
the absence of any correlation. This surprising correlation of innova¬ 
tions in consumption and innovations in asset returns suggests that 
reformulation along the lines of a consumption beta model might 
deserve some consideration. 2 However, the results in Hansen and 
Singleton (1982, 1983) do not lend much support to a strict formula¬ 
tion of that model. Also, Mankiw and Shapiro (1986) reported find¬ 
ings in favor of the traditional CAPM versus the consumption beta 
formulation. 

VI. Conclusions 

In summary, the results reported in this paper support several con¬ 
clusions. First, the conditional covariance matrix of the asset returns is 
strongly autoregressive. The data clearly reject the assumption that 
this matrix is constant over time. The expected return or risk premia 
for the assets are significantly influenced by the conditional second 
moments of returns. There is also some evidence that the risk premia 
are better represented by covariances with the implied market than by 
own variances. However, information in addition to past innovations 


2 The test probably has too small a size since, even if the CAPM were true, there 
might be consumption out of portfolio wealth, which would lead to a rejection. The 
rejection occurs only when future consumption is included, and this is, of course, not in 
the agents' information set. The LM test statistic for innovations in current consumption 
takes the value S.2BU corresponding to the .05 fractile in a $\chi^2_3$ distribution. 
in asset returns is important in explaining premia and heteroscedas- 
ticity. In particular, lagged excess holding yields and innovations in 
consumption appear to have some explanatory power for the asset 
returns. 

Probably even better econometric models with a richer specification 
for the risk premia, not necessarily derived directly from any economic 
theory, can therefore be constructed. See Hansen and Hodrick 
(1983) for a discussion along these lines. Other interesting questions 
that remain are the sensitivity of the results to the choice of the “mar¬ 
ket portfolio” and a quarterly one-period horizon. It is possible that 
wider definitions of the market would allow the model to do better. 
We leave the answer to all these questions for future research. 


Data Appendix 

The yields on 3-month Treasury bills, 6-month Treasury bills, and 20-year 
Treasury bonds were all taken from the Salomon Brothers’ Analytic Record of 
Yields and Yield Spreads. The yields are percentages per annum for the first 
trading day in January, April, July, and October. The returns were converted 
to quarterly rates $r_t^{f}$, $r_t^{\text{bill}}$, and $r_t^{\text{bond}}$, where $(1 + R_t)^{1/4} = 1 + r_t$. From these 
rates the one-quarter excess holding yields were calculated as 

$$y_t^{\text{bill}} = 100\left[\frac{(1 + r_t^{\text{bill}})^2}{1 + r_{t+1}^{f}} - 1 - r_t^{f}\right], \tag{A1}$$

$$y_t^{\text{bond}} = 100\left[\frac{r_t^{\text{bond}}}{r_{t+1}^{\text{bond}}} + r_t^{\text{bond}} - 1 - r_t^{f}\right]. \tag{A2}$$


The stock market yields were based on the return on the value-weighted New 
York Stock Exchange index including dividends obtained from the Center 
for Research in Security Prices at the University of Chicago. From the quarterly 
flow of returns, $r_t^{\text{stock}}$, the one-quarter excess holding yield was simply 
calculated by 

$$y_t^{\text{stock}} = 100\,(r_t^{\text{stock}} - r_t^{f}). \tag{A3}$$
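
The conversions in (A1) through (A3) can be sketched in a few lines. This is one reading of the formulas, offered under assumptions: annual yields are taken to be quoted in percent and converted to fractions before compounding, the bond formula is the usual consol-type approximation to a one-quarter holding return, and all function names and input values are hypothetical.

```python
def quarterly_rate(annual_pct):
    """Convert an annual yield in percent to a quarterly rate (fraction),
    using (1 + R)^(1/4) = 1 + r."""
    return (1.0 + annual_pct / 100.0) ** 0.25 - 1.0


def excess_bill(r_bill_t, r_f_next, r_f_t):
    """(A1): buy a 6-month bill, sell it one quarter later as a 3-month
    bill, and subtract the risk-free 3-month rate."""
    return 100.0 * ((1.0 + r_bill_t) ** 2 / (1.0 + r_f_next) - 1.0 - r_f_t)


def excess_bond(r_bond_t, r_bond_next, r_f_t):
    """(A2): approximate one-quarter holding return on a long bond
    (capital gain from the yield change plus the coupon yield),
    less the risk-free rate."""
    return 100.0 * (r_bond_t / r_bond_next + r_bond_t - 1.0 - r_f_t)


def excess_stock(r_stock_t, r_f_t):
    """(A3): quarterly stock return less the risk-free rate."""
    return 100.0 * (r_stock_t - r_f_t)


# Made-up inputs: a flat 8 percent annual yield curve that does not move.
r = quarterly_rate(8.0)
flat_bill = excess_bill(r, r, r)
flat_bond = excess_bond(r, r, r)
```

With a flat and unchanged yield curve both excess holding yields are zero, which is a useful sanity check on the signs in the formulas.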

The maturity distribution of the interest-bearing public debt held by pri¬ 
vate investors was taken from the Federal Reserve Bulletin and, for 1976 until 
the present, from the Treasury Bulletin. To get from par values to market 
values, we multiplied the outstanding debt in the different maturity catego¬ 
ries within 1 year, 1-5 years, 5-10 years, and 10 years and over by the price 
indices reported in Cox (1985). The categories were then added together to 
get the market values within 1 year, bills, and longer than 1 year, bonds. The 
total market value of corporate equities was obtained from a special tabula¬ 
tion of the balance sheets of the flow of funds accounts by the Board of 
Governors of the Federal Reserve System. The relative market values are 
illustrated in figure Al. 

Finally, the data for personal consumption expenditure on nondurables in 
1972 dollars, $C_t$, were obtained from Citibank Economic Database and from the 
Survey of Current Business. 
This paper provides a theory of legislative institutions that parallels 
the theory of the firm and the theory of contractual institutions. Like 
market institutions, legislative institutions reflect two key compo¬ 
nents: the goals or preferences of individuals (here, representatives 
seeking reelection) and the relevant transactions costs. We present 
three conclusions. First, we show how the legislative institutions en¬ 
force bargains among legislators. Second, we explain why, given the 
peculiar form of bargaining problems found in legislatures, specific 
forms of nonmarket exchange prove superior to market exchange. 
Third, our approach shows how the committee system limits the 
types of coalitions that may form on a particular issue. 


The organization of Congress meets remarkably well 
the electoral needs of its members. To put it another 
way, if a group of planners sat down and tried to design 
a pair of American national assemblies with the goal of 
serving members’ electoral needs year in and year out, 
they would be hard pressed to improve on what exists. 

[Mayhew 1974, p. 81] 

The new economics of organization holds that explicit market ex¬ 
change is not the universally ideal institution for a transaction. The 
most successful application of this approach, the theory of the firm, 
attempts to explain, for example, why some transactions take place 
within a firm under certain circumstances and across a market (e.g., 
between firms) under others. 1 This theory also focuses on the struc¬ 
ture of the corporation, notably the separation of ownership and 
control (Alchian and Demsetz 1972; Jensen and Meckling 1976; 
Fama 1980; Fama and Jensen 1983; Demsetz and Lehn 1985; Gross- 
man and Hart 1986). With few exceptions, however, it has not consid¬ 
ered other types of organizations, such as public bureaucracies, polit¬ 
ical parties, or legislatures. 2 The purpose of this paper is to extend 
this theory to the study of political organizations and, in particular, to 
explain the pattern of institutions within the legislature that facilitates 
decision making. 

Studies of public policy-making emphasize the dependence of polit¬ 
ical decisions on interest group and constituency participation. While 
this approach is consistent with outcomes in many individual policy 
areas, it fails to explain how so many diverse interests are provided 
with policy benefits simultaneously. A huge variety of interests are 
represented in the legislature, and almost none is represented by a 
majority. For most interests to gain policy benefits, representatives 
with different constituents must agree to exchange support. Put an¬ 
other way, the diversity of interests creates gains from exchange 
within the legislature. While the literature implicitly assumes that 
these gains are captured, it fails to explain how trades are accom¬ 
plished and enforced. If public policy reflects a series of bargains 
among various interests, how are these bargains maintained over 
time? As we know from the modern literature on contracts, the an¬ 
swer to this question is not always straightforward since not all agree¬ 
ments are enforceable. 


1 Typical applications focus on the various forms of vertical relations (Coase 1937;
Williamson 1975, 1985; Klein, Crawford, and Alchian 1978). Besides these more general
treatments of vertical integration, there are excellent treatments of other forms of
vertical relations such as franchising (Rubin 1978), resale price maintenance (Gilligan
1986), and long-term contracting (Joskow 1985).

2 The exceptions include Goldberg (1976), Moe (1984), Weingast (1984), Miller and
Moe (1986), Tirole (1986), Milgrom and Roberts (1987), and some of the topics in
North (1981). The program for wide application of the approach is discussed in Jensen
(1983). Fama and Jensen (1983) extend the analysis of market organizations to include 
some nonprofit ones, though their analysis only begins the study of this important 
category of widely different organizations. 
To address these issues, we develop a theory of legislative institu¬
tions that parallels the theory of the firm and the theory of contrac¬ 
tual institutions. Like market institutions, legislative institutions 
reflect two key components: the goals and preferences of individuals, 
here legislators seeking reelection from their constituents, and the 
transactions costs that are induced by imperfect information, oppor¬ 
tunism, and other agency problems. But the enforcement mecha¬ 
nisms underpinning exchange in market settings are typically un¬ 
available to or inappropriate for the legislature. Solutions to con¬ 
tractual problems that arise in the market (e.g., vertical integration) 
do not directly translate into solutions to similar problems found in 
legislatures. We show how the legislative institutions enforce bargains 
among legislators and why, given the peculiar bargaining problems 
found in legislatures, specific nonmarket exchange mechanisms 
prove superior to market exchange. From a policy perspective, these 
institutions have important implications. Durability of bargains leads 
both to the durability of policies that these bargains are designed to 
implement and to the coalition supporting these policies. Our model 
thus has important implications for coalition formation and mainte¬ 
nance. 

Section I summarizes the new economics of organization. Section II
begins the analysis by presenting several assumptions on which our
approach is based. Section III describes models of the market for votes
and focuses on enforcement problems. Section IV presents our the¬ 
ory of legislative institutions and suggests why these institutions solve 
problems that arise in simple markets. Section V provides empirical 
evidence on several propositions that follow from our model. This 
evidence, from a variety of contexts involving the U.S. Congress, 
provides significant support for the model. Section VI derives some 
comparative static results that provide some additional evidence for 
the approach and suggest some important avenues for additional 
tests. A discussion section, Section VII, follows in which we explore 
alternative explanations for enforcing legislative exchange along with 
possible extensions of our approach. 


I. The New Economics of Organization 

The theory of the firm holds that production and exchange take place 
through institutions (contractual patterns, organizational forms) that 
reflect the specific pattern of transaction costs found in trade. The 
emphasis of this theory is on how specific organizational or contrac¬ 
tual forms reduce these costs. Some of the important results from this 
literature will prove useful in our discussion of legislatures. 

The seminal paper in this tradition (Coase 1937) asserts that the 
firm emerges not simply to take advantage of specialization or econo¬
mies of scale but to avoid the costs of using markets and the price 
system: “The main reason why it is profitable to establish a firm would 
seem to be that there is a cost of using the price mechanism. The most 
obvious cost of ‘organising’ production through the price mechanism 
is that of discovering what the relevant prices are” (p. 390). In other 
words, the firm provides a set of contractual mechanisms that substi¬ 
tutes for the price mechanism, in part because the price mechanism is 
too costly to use in certain circumstances. 3 

A major theme in the literature is that the institutions of the firm 
are designed, in part, to reduce the costs of assuring contractual per¬ 
formance. In the words of Williamson (1985, pp. 48-49), “Transactions
that are subject to ex post opportunism will benefit if appropriate
safeguards can be devised ex ante. Rather than reply to opportunism
in kind, therefore, the wise [bargaining party] is one who seeks both
to give and receive ‘credible commitments.’ Incentives may be realigned,
and/or superior governance structures within which to organize
transactions may be devised.” This principle is one of the central
lessons of this body of work; it underlies much of institutional and
organizational design. 4

The costs of assuring contractual performance are high in a variety 
of circumstances. Two settings concern us. The first centers on problems
of observability (Holmstrom 1979) or measurement (Barzel
1982), for example, when it is difficult to separate out an agent’s
contribution from that of random events or when an agent has pri¬ 
vate information about, say, the quality of the good being sold. Im¬ 
perfect observability generates well-known problems such as moral 
hazard, adverse selection, and shirking that plague simple spot mar¬ 
ket exchange. A large part of the literature spells out ex ante contrac¬ 
tual forms designed to mitigate these problems. The second setting 
centers on incomplete contracts, for example, when it is impossible 
(or too costly) for contracting parties to plan for all possible contin¬ 
gencies. Several scholars have studied these settings and the attendant 
problems of ex post opportunism that arise when ex post incentives of 
the bargaining parties are inconsistent with performing ex ante 
agreements (e.g., Klein et al. 1978; Kreps 1984; Williamson 1985; 
Grossman and Hart 1986). Those works also study a variety of mecha¬ 
nisms that are used to mitigate these problems, typically some form of 
vertical relations. 


3 See also the discussion in Cheung (1983).

4 Virtually every paper cited on the theory of the firm makes this argument. For
particular details, see, e.g., Barzel (1982), Fama and Jensen (1983), Kreps (1984), or
Williamson (1985).
We emphasize that the literature is not simply an analysis of con¬
tractual failures. As suggested by Williamson in the quote above, ex 
post problems lead to the design of organizational forms to mitigate 
these problems. The literature on vertical integration, for example, 
argues that this organizational form is largely an endogenous re¬ 
sponse to ex post contractual problems of the sort we have just men¬ 
tioned. This example illustrates the argument that a particular form 
of internal organization proves superior to market exchange. 

A major limitation of the new economics of organization is that it 
remains largely tied to market settings. Though the principles are 
obviously more general (as clearly articulated in Jensen [1983] or
Milgrom and Roberts [1987]), applications to other settings are just 
beginning. Indeed, developing a general theory of organizations re¬ 
quires effectively applying this theory to types of organizations be¬ 
yond those included in the set studied to generate it. 


II. Representatives and Their Constituencies 

In this paper, we take up this challenge by showing how this approach 
illuminates phenomena that take place in legislatures. The perspective
developed in this paper rests on three assumptions.

Assumption 1. Congressmen represent the (politically responsive) interests
located within their district.—While rational ignorance pervades the
political system, that does not imply that the interests of constituents 
are irrelevant for representatives or that the latter are free to pursue 
their own interests. Rather, rational ignorance underpins interest 
group advantage in politics. Because most voters have only a dim 
awareness of an incumbent’s actions, rational ignorance biases polit¬ 
ical response toward those who do form impressions. Thus interest 
groups, because they have greater individual stakes in particular 
issues, monitor congressmen and provide them with information. 
Groups also mobilize their members in support of friendly congress¬ 
men. 

Interest groups are not uniformly distributed. They typically have 
concentrations of voters in particular locations. Farm organization 
members, for example, are concentrated in specific districts; so too 
are consumers of food stamps and members of welfare rights organi¬ 
zations. The elderly, to take another example, have a disproportion¬ 
ate presence in Florida and Arizona (medicare and social security) 
while miners are found in West Virginia, Pennsylvania, and southern 
Illinois (mine safety, black lung disease). 

In the competition for interest group support, specific representa¬ 
tives have a comparative advantage. The lack of complete fungibility 
of votes implies that legislators are advantaged in attracting support 
from interest groups located in their district (see Denzau and Munger
1986). This advantage arises because service to local interests attracts 
both votes and organized resources for the district’s representative. 
Service to this group by an outsider, in contrast, attracts only the latter 
and may lose votes. 

Electoral competition induces congressmen, at least in part, to rep¬ 
resent the interests of their constituents. Because groups are not uni¬ 
formly distributed across constituencies, different legislators repre¬ 
sent different groups. 5 

Assumption 2. Parties place no constraints on the behavior of individual
representatives.—Parties were strong around the turn of the century
when they possessed reward systems and sanction mechanisms to control
the behavior of members. Specifically, party organizations determined
entry into competition for the local seat, the positions of power
within the legislature, and the distribution of legislative benefits (e.g.,
a representative obtained legislative benefits only if he supported
party measures). None of these conditions now holds. In what follows,
we therefore treat the individual as the decision-making unit. 6

Assumption 3. Majority rule is a binding constraint.—Proposed bills
(alterations in the status quo) must command the support of a majority
of the entire legislature in order to become law.


III. The Gains from Exchange: The Problem
to Be Solved

Legislators pursue their reelection goals by attempting to provide
benefits to their constituents (assumption 1). Acting alone, they cannot
succeed (assumption 3). This, in combination with the diversity of
interests they represent, generates gains from exchange and cooperation
among legislators. But what institutions underlie—and enforce—this
cooperation?

5 Evidence for this view abounds in the literature. For a recent summary in the
political science literature, see Fiorina (1981b). In the economics literature, systematic
evidence has been provided as part of the controversy over ideological voting in Congress.
While the empirical issue concerns the degree to which representative behavior
can diverge from constituents’ interests, all studies provide substantial evidence that the
latter systematically—though not necessarily completely—affects congressional voting
(see Kau and Rubin 1979; Kalt and Zupan 1984; Peltzman 1984).

6 Substantial evidence for this assumption is provided in the political science literature
(see, e.g., Mayhew 1966). To take one example, the whip system, once a tool of the
leadership to keep party members in line, now operates as a service organization providing
information to the leadership and to the members. To quote one popular text on
Congress, it “operates not as much as a device to coerce or even persuade members as it
does simply to inform the leadership of the disposition of members toward legislation”
(Polsby 1984, p. 129).
The new economics of organization suggests that institutions evolve
to ensure delivery of benefits. In order to understand why one exchange
mechanism survives instead of another, we need to study the
potential agency and transactions cost problems faced by legislators
given the types of trades they seek to make. It is useful to begin by
focusing on previous approaches to legislative exchange that explicitly
rely on marketlike mechanisms. By studying the enforcement
problems encountered in this setting, we can determine the characteristics
a more appropriate legislative exchange mechanism must possess.

Previous work has focused on vote trading, also known as logrolling,
centralized legislative exchange, or legislative IOUs. The major proponents
of particular versions include Tullock (1967, 1981), Wilson
(1969), Telser (1980), Koford (1982), and Becker (1983). While there
are significant differences among these approaches, fundamental to
each is an explicit or implicit market in votes. Under the most well-known
logrolling version, legislators begin with proposals to benefit
themselves at the expense of others, but none of these proposals
commands a majority (Buchanan and Tullock 1962; Tullock 1967,
1981). Legislators therefore search out trading partners. In exchange
for support, each gets his proposal passed and, on net, is better off. In
the explicit market versions, votes are bought and sold for a price,
with the “equilibrium” prices determining vote trades and hence the
set of bills passed (see also Wilson 1969; Koford 1982).

The motivation underlying these market models is clear. By giving 
away votes on issues that have lower marginal impact on their district 
(and therefore on their electoral fortunes) in exchange for votes on 
issues having a larger marginal impact, legislators are better off. 
Whether or not they incorporate an explicit auction, models of the 
legislative market for votes have considerable appeal. 
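
The logic of a simple logroll can be sketched numerically. The example below is an illustration of ours rather than any of the cited models: the three legislators, the two bills, and every payoff number are hypothetical assumptions. Each bill benefits only its sponsor's district, so neither commands a majority under sincere voting, yet both pass once two sponsors exchange votes.

```python
# Hypothetical district payoffs (illustrative assumptions, not data).
# PAYOFFS[bill][legislator] is the net benefit to that legislator's district.
PAYOFFS = {
    "dam":    {"A": 5, "B": -2, "C": -2},
    "bridge": {"A": -2, "B": 5, "C": -2},
}
LEGISLATORS = ["A", "B", "C"]


def passes(votes):
    """Majority rule: a bill passes if more than half of all members vote yes."""
    return sum(votes.values()) > len(votes) / 2


def sincere_votes(bill):
    """Each legislator votes yes only if the bill benefits his own district."""
    return {leg: PAYOFFS[bill][leg] > 0 for leg in LEGISLATORS}


def logroll_votes(bill, partners):
    """Trading partners vote yes on every bill; everyone else votes sincerely."""
    votes = sincere_votes(bill)
    for leg in partners:
        votes[leg] = True
    return votes


def net_payoffs(passed_bills):
    """Total payoff to each legislator from the set of bills that pass."""
    return {leg: sum(PAYOFFS[b][leg] for b in passed_bills) for leg in LEGISLATORS}


no_trade = [b for b in PAYOFFS if passes(sincere_votes(b))]
with_trade = [b for b in PAYOFFS if passes(logroll_votes(b, ("A", "B")))]
print(no_trade, net_payoffs(no_trade))
print(with_trade, net_payoffs(with_trade))
```

Under these assumed numbers the two traders each gain on net while the excluded legislator bears the cost, which is the pattern the logrolling literature describes. The sketch is static: it says nothing about whether either partner keeps his side of the bargain.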

A careful inspection, however, reveals that this approach assumes 
away some of the deepest problems plaguing legislative exchange. It 
assumes, for example, that all bills and their payoffs are known in 
advance; that is, there are no random or unforeseen future events 
that may influence outcomes or payoffs. Either the time dimension is 
suppressed or enforcement of agreements over time is left exoge¬ 
nous. Because these models study a legislature with no future, they 
cannot address how legislators cope with agreements that cover more 
than one legislative session. 

A variety of exchange problems arise because the value of today's 
legislation significantly depends on next year’s legislative events. 
Members of future sessions face incentives different from those faced 
when the trade occurred and may seek, for example, to amend, abol¬ 
ish, or simply ignore previous agreements. Because current legislators 
typically cannot bind a future legislative session, problems of enforce¬
ment over time are critically important for understanding legislatures 
and cannot be assumed away. Moreover, as we will see, these settings 
inhibit the ability of noninstitutional enforcement of cooperation 
(e.g., reputation) as the sole means of policing bargains. In the face of 
uncertainty over the future status of today’s bargain, therefore, legis¬ 
lators will devise institutions for long-term durability of agreements 
that ensure the flow of benefits beyond this session of the legislature. 

To begin our analysis, we observe that most models of the legisla¬ 
tive market apply to only a subset of problems faced by legislators, 
typically the pork barrel. Pork barrel programs are an important part 
of every major Western government, but they have special character¬ 
istics that do not hold for other types of legislation. For example, 
benefit flows are contemporaneous to different legislators (in this 
case, the funds financing the project), and consummation of trading is 
simultaneous (see, e.g., Buchanan and Tullock 1962; Tullock 1981; 
Koford 1982). Focusing solely on pork barrel-type programs rules 
out virtually all the important issues studied in the regulatory litera¬ 
ture as well as the major U.S. redistributive programs. 7 We consider 
the problems generated by noncontemporaneous benefit flows and 
nonsimultaneity in turn. 

A. Noncontemporaneous Benefit Flows 

To see how differential patterns of benefit flows potentially inhibit 
trading, consider the following exchange problem. Suppose that a 
group of legislators seeking pork, for example, dams and bridges, 
attempts to find some other group of legislators with whom to ex¬ 
change votes. Suppose further that one potential set of trading part¬ 
ners is a group of legislators who seek a flow of services from a 
regulatory agency. If the two sides exchange votes, the first group 
obtains its dams and bridges while the second obtains its regulatory 
agency. Once the dams are built, however, what stops the first group 
from reneging on the agreement, for example, from working during 
a future legislative session to revoke the regulatory benefits? Simple 
market exchange institutions do not adequately protect against this 
form of reneging (and, as we will see, repeated interaction alone is 
insufficient to prevent this problem). Rational coalition partners, 
therefore, discount the potential gains from a proposed trade by the 
probability that these benefit flows will be curtailed by reneging. Con¬ 
sequently, the second group of legislators might not accept the trade 


7 For several surveys in this literature, see the articles in Fromm (1981). 
toral effects of this change are observable solely by the legislator it
affects, the first legislator may argue that, while he could support the 
original bill, he cannot support the new version. On the other hand, 
the drafters of the legislation, having gained additional support 
through trades, may opportunistically rewrite the legislation so as to 
increase their own benefits (and impose greater costs on others). 

Trading in legislative IOUs thus poses considerable contractual 
problems of the sort studied in the theory of the firm. Either IOUs 
must be for a specific form of a bill without any alterations or they 
must provide for hundreds of contingencies, many of which are not 
observable to both parties. Neither form of IOU is likely to prove 
useful. The former severely limits the trading possibilities. Since most 
legislation is altered at several stages before it is passed, this form of 
IOU exchanges one vote for sure against one vote under relatively 
rare circumstances—an unlikely basis for a transaction. 11 Further, 
different contingencies are important to different legislators, and the 
market for specific, contingent IOUs is likely to be extremely thin,
perhaps requiring a different price for each potential trade. As Coase
(1937) observed, this obviates the benefits of a price system. But perhaps
more important, the observability problems associated with
many contingencies suggest that IOUs are unenforceable: how are
the parties to agree ex post when the number of possible events is
larger than the number of specified contingencies and when both
parties cannot observe the outcome?

This discussion reveals that market forms of exchange are limited
as a means of capturing the gains from trade. As noted in Section I,
problems with observability and ex post enforceability are fundamental
to understanding the motivation for internalizing a transaction
within a firm. Just as these problems lead to the emergence of vertical
integration to replace market exchange, they motivate the design of
institutions within the legislature that substitute for explicit market
exchange.

In the discussion so far, there has been little mention of the role of
repeat play. Repeated interaction provides incentives for individuals
to adhere to agreements this period so as to maintain a flow of benefits
over time. 12 This form of endogenous cooperation surely plays a

11 See Ferejohn (1974b) for a further exploration of the peculiar properties of a
market in votes. This stems in part from results in the collective choice literature that
show that when one set of vote trades is feasible, so are many others (e.g., Schwartz).
This prevents the logic of the standard arguments about supporting price systems
from holding in this context.

12 See, e.g., Axelrod (1984) and Calvert (1985). There is, of course, a growing literature
in economics on this topic (e.g., Telser 1980; Klein and Leffler 1981; Kreps and
Wilson 1982; Roberts 1986). A further problem limits the workability of this solution,
that of legislative turnover. Even in current times when incumbents are reelected with
role in legislatures, and for some settings, it alone may be sufficient to
police bargains. It is well known, however, that “the long arm of the
future” is inadequate in settings in which agents have private information
and in which it is impossible or too costly to specify all contingencies
in advance. 13 It is precisely these problems that we have argued
motivate the need for alternative legislative institutions. The
importance of unanticipated contingencies in both noncontem- 
poraneous and nonsimultaneous trading combined with private in¬ 
formation and moral hazard in the latter suggests the need for addi¬ 
tional mechanisms to maintain bargains. 

Perhaps another way of putting the argument of this section is as 
follows. Repeat play alone is insufficient to prevent the breakdown of 
cooperation under certain circumstances. Legislators therefore have
an incentive to devise institutions that reduce the circumstances in
which breakdown occurs. In this sense, legislative rules are not substitutes
for reputation building and trigger strategies commonly used in
repeat play. Rather, rules complement the use of these strategies and, 
in particular, prevent the breakdown of cooperation at precisely the 
circumstances under which these other strategies fail. 
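
One narrow way to see how repeat play can break down uses the standard grim-trigger condition from the repeated-games literature. The sketch below is our illustration, not part of the argument above. The payoff numbers and discount factor are hypothetical, and treating turnover as a per-term survival probability that scales the discount factor is an assumption. It also leaves out private information and unforeseen contingencies, which are the problems this section emphasizes.

```python
def min_discount(coop, defect, punish):
    """Lowest effective discount factor d at which grim trigger sustains cooperation.

    Cooperating forever is worth coop / (1 - d). Defecting once and then being
    punished forever is worth defect + d * punish / (1 - d). Requiring the first
    to be at least the second gives d >= (defect - coop) / (defect - punish).
    """
    return (defect - coop) / (defect - punish)


def sustainable(coop, defect, punish, delta, survival=1.0):
    """Check the condition when the future is also discounted by the chance
    (assumed here to be a fixed probability) that a trading partner returns."""
    return delta * survival >= min_discount(coop, defect, punish)


# Hypothetical per-term payoffs: keep the bargain, renege once, no cooperation.
COOP, DEFECT, PUNISH = 3.0, 10.0, 0.0
DELTA = 0.75  # assumed rate at which legislators value next term's benefits

no_turnover = sustainable(COOP, DEFECT, PUNISH, DELTA)
with_turnover = sustainable(COOP, DEFECT, PUNISH, DELTA, survival=0.9)
print(min_discount(COOP, DEFECT, PUNISH), no_turnover, with_turnover)
```

With these assumed numbers the bargain is self-enforcing when partners are certain to return and fails once a modest chance of departure is introduced. The larger the once-and-for-all gain from reneging, the higher the threshold, which is where rules that block reneging do their work.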

This argument closely parallels that of vertical integration in which 
reputation effects are also insufficient to police cooperation between 
firms. In both cases, potential contractual problems lead to the design 
of institutions that substitute for market exchange; in so doing, they 
improve ex post enforceability of agreements. This does not imply 
that reputation building is unimportant in legislatures or in firms that 
are vertically tied, just that it is not the sole means of enforcing agree¬ 
ments. Indeed, the other institutions of the legislature undoubtedly 
facilitate its use as a means to complement other devices. 

C. Implications 

Problems concerning the durability and enforceability of bargains are 
ubiquitous in legislative settings, limiting the value of explicit market 
forms of exchange. 14 Put another way, coalitions lack durability under


high frequency, the average net turnover in Congress is 10 percent per term. More¬ 
over, the losers are typically replaced with members with different preferences if only 
because the latter, in order to beat the former, had to devise a separate support con¬ 
stituency. 

13 The literature on the theory of the firm is built on the premise that the incentives 
derived from repeat dealings alone are insufficient to police incentive problems. Exam¬ 
ples are the vertical integration or the optimal structure of financial claims. See the 
references in n. 2. 

14 Moreover, the problem of non-pork barrel programs and lack of simultaneity do 
not exhaust the situations in which a legislative market is a poor provider of durability. 
For example, even if two groups of legislators both seek permanent regulatory benefits, 
an explicit market exchange system. In the face of these problems,
legislators will devise alternative institutions that provide exchanges 
with a greater degree of durability (see Ferejohn 1986). We now turn 
to a discussion of how this is accomplished. 

IV. The Legislative Committee System 

This section develops a model of an idealized legislative committee 
system. The types of policies (i.e., legislative bargains) that emerge 
from this model parallel those predicted by the vote-trading models; 
but it is not plagued by problems of enforcement of exchanges. The 
legislative committee system is defined by the following three con¬ 
ditions. 

Condition 1. Committees are composed of a number of seats or 
positions, each held by an individual legislator. Committees possess 
the following properties: (a) associated with each committee is a 
specific subset of policy issues over which it has jurisdiction (e.g., 
commerce, energy, banking, or agriculture); (b) within their jurisdic¬
tion, committees possess the monopoly right to bring alternatives to 
the status quo up for a vote before the legislature; and (c) committee 
proposals must command a majority of votes against the status quo to 
become public policy. 

Condition 2. There exists a property rights system over committee 
seats called the “seniority system.” It has the following characteristics:
(a) a committee member holds his position as long as he chooses to
remain on the committee; subject to his reelection, he cannot be
forced to give it up; (b) leadership positions within the committee
(e.g., chairmanship) are allocated by seniority, that is, the length of 
continuous service on the committee; (c) rights to committee positions 
cannot be sold or traded to others. 

Condition 3. Whenever a member leaves a committee (e.g., by 
transfer, death, or defeat), his seat becomes vacant. There is a bidding 
mechanism whereby vacant seats are assigned to other congressmen. 

Condition 1 defines the source of committee power and value, con¬ 
dition 2 defines the property rights system associated with committee 
positions, and condition 3 establishes an exchange mechanism over 
the rights established under 1 and 2. 

Let us explore the consequences of the legislative committee system
to determine its enforcement properties, how new policies are provided,
its control of the agency problems that arise from the delegation
of power to a particular subset of members, and the types of
policies that are likely to emerge from it.

changing electoral fortunes may promote growth in one and shrink the other; to the
extent that this change appears reasonably permanent, it provides the conditions fostering
a revocation of the latter group’s benefits. When the once and for all gains
exceed the cost potentially imposed by the (now smaller) other side, reneging is likely to

A. Enforcement of Legislative Bargains 

The committee system provides substantial protection against oppor¬ 
tunistic behavior, thereby providing durability to policy bargains. To 
see this, consider the setting described above in which one group of 
legislators seeks dams and bridges and the second seeks a regulatory 
agency benefiting its constituents. In the legislative market, this agree¬ 
ment is vulnerable to ex post reneging of the following form: the first 
group, after building its dams, might form a coalition with other 
legislators (perhaps the minority excluded from the original deal) to 
pass a new bill revoking the regulation benefiting the second group. 

But now consider the same bargain assuming that it was forged 
under the committee system and that the first group controlled the 
committee with jurisdiction over pork barrel programs, the second, 
the committee with the jurisdiction over the relevant regulations. 
Under the committee system, the second group retains control over 
the agenda within its jurisdiction. Suppose that, once the dams and 
bridges are completed, the first group introduces legislation to revoke 
the benefits flowing to the second group, and, further, a majority 
supports this legislation. However, only the committee with jurisdic¬ 
tion can bring it to the floor for a vote. This control over the agenda 
within its jurisdiction implies that a committee has veto power over 
the proposals of others. Since this proposal would make the commit¬ 
tee worse off (and since, by assumption, a majority will support it on 
the floor), the committee would not allow it to come up for a vote. In 
other words, the restricted access to the agenda serves as a mechanism 
to prevent ex post reneging. 
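
The gatekeeping argument can be stated as a two-line rule. The sketch below is a minimal illustration under assumptions of ours: the chamber size, the vote count, and the representation of the committee's stake as a single number are hypothetical. It compares an open agenda, where any proposal with majority support passes, with the committee system of condition 1, where the committee with jurisdiction must first report the proposal.

```python
from dataclasses import dataclass

N_LEGISLATORS = 100  # assumed chamber size, for illustration only


@dataclass
class Proposal:
    area: str              # policy jurisdiction the proposal falls under
    floor_yes: int         # members who would vote yes if it reached the floor
    committee_gain: float  # payoff change for the committee holding jurisdiction


def has_majority(p):
    """Majority rule against the status quo (assumption 3)."""
    return p.floor_yes > N_LEGISLATORS / 2


def passes_open_agenda(p):
    """Simple market setting: any proposal with a floor majority becomes law."""
    return has_majority(p)


def passes_committee_system(p):
    """Condition 1(b): only the committee with jurisdiction can bring the
    proposal to a vote, and it will not report one that makes it worse off."""
    reported = p.committee_gain >= 0
    return reported and has_majority(p)


# The dam builders' attempt to revoke the second group's regulatory benefits.
revoke = Proposal(area="regulation", floor_yes=60, committee_gain=-1.0)
print(passes_open_agenda(revoke), passes_committee_system(revoke))
```

The same proposal that would pass on an open floor never reaches a vote once the affected committee controls the agenda, which is the sense in which restricted access protects the original bargain.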

Moreover, because exchanges in influence are institutionalized 
through the property rights system, the absence of simultaneity is 
considerably less troublesome. As long as the property rights system is 
maintained, the agenda power held by each committee substitutes for 
outstanding IOUs with uncertain contingencies. The problems associ¬ 
ated with devising contingent claims over future events are relatively 
absent under the legislative committee system. 

B. Providing New Benefits (or How Committees Capture 
the Gains from Exchange) 

The agenda rights afford committee members considerable influence 
over policy choice within their jurisdiction. This follows because the 
set of points that command a majority against any given status quo,
W(sq), is generally quite large (McKelvey 1976, 1979; Shepsle and 
Weingast 1981). Typically, W(sq) includes a wide range of policy alter¬ 
natives, some making committee members worse off and some mak¬ 
ing them better off. Given this range of alternatives, agenda power 
allows committees to bias the outcome in favor of the alternative they 
most prefer. 15 
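
A small spatial example shows how agenda power selects among the elements of W(sq). The sketch below is an illustration under assumptions of ours and is not the cited authors' construction: three legislators with hypothetical ideal points in a two-dimensional policy space, preferences that decline with Euclidean distance, and a coarse grid standing in for the set of feasible proposals.

```python
import math

# Hypothetical ideal points in a two-dimensional policy space.
IDEALS = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]

# Coarse grid of feasible proposals: both coordinates run from 0.0 to 2.0.
GRID = [(i / 10, j / 10) for i in range(21) for j in range(21)]


def beats(x, sq):
    """True if a strict majority is strictly closer to x than to the status quo."""
    yes = sum(math.dist(x, ideal) < math.dist(sq, ideal) for ideal in IDEALS)
    return yes > len(IDEALS) / 2


def win_set(sq, grid=GRID):
    """Grid approximation of W(sq): proposals that defeat sq under majority rule."""
    return [x for x in grid if beats(x, sq)]


def committee_choice(sq, committee_ideal, grid=GRID):
    """The agenda setter proposes the element of W(sq) closest to its own ideal
    point, and keeps the status quo if nothing defeats it."""
    candidates = win_set(sq, grid)
    if not candidates:
        return sq
    return min(candidates, key=lambda x: math.dist(x, committee_ideal))


status_quo = (0.5, 0.35)
for ideal in IDEALS:
    print(ideal, committee_choice(status_quo, ideal))
```

Running the loop with each legislator in turn as the agenda setter gives a feel for how far the outcome can move depending on who holds the committee. A finer grid gives a closer approximation of W(sq) at higher cost.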

The committee system institutionalizes a trade among all the legis¬ 
lators, policy area by policy area, for the right to select which points 
from W(sq) replace the status quo. But this is neither accomplished 
nor enforced by an explicit market exchange. Rather, a legislator on 
committee i gives up influence over the selection of proposals in the 
area of committee j in exchange for members of committee j giving
up their rights to influence proposals in area i. Institutionalizing
rights over agenda power—that is, control over the design and selec¬ 
tion of proposals that arise for a vote—substitutes for purchasing the 
votes of others in an explicit market. Since any element of W(sq) will 
pass by definition, it is the influence over elements of this set afforded 
committees by agenda power that eliminates the need for explicit 
exchange of votes. 

C. Who Gains Influence (or How Are the Gains from 
Exchange Distributed)? 

This question concerns the types of policies chosen under the com¬ 
mittee system. Since committees afford their members disproportion¬ 
ate influence over policy choice within their jurisdiction, it also con¬ 
cerns the mechanism that assigns legislators to committees. 

Condition 3 provides that the legislature uses a bidding mechanism 
to assign members to committee positions. Since a representative’s 
electoral fortunes depend on his obtaining benefits for his constitu¬ 
ents and since constituent interests differ, legislators seek assignment 
to those committees that have the greatest marginal impact over their 
electoral fortunes. The real opportunity costs of bidding for commit¬ 
tee i are that the representative gives up the possibility of holding a 
seat on committee j. Thus representatives from farm districts are 
much more likely to bid for seats on agriculture committees than they 
are for seats on urban, housing, or merchant marine committees. A 
potential problem arises, however, because some committees are 
valued by all (e.g., the spending or taxing committees). However, here 
too the bidding mechanism determines assignment. The more competition 
for seats, the less likely the bid will be successful. Suppose 
each potential bidder for a highly valued committee (e.g., one concerning 
taxes) also values some specific policy committee with much 
less competition (e.g., housing, agriculture, or public works). The 
increased competition for seats on the tax committees implies that 
only those with the greatest differential value between the tax committee 
and their next-best alternative will pay the opportunity cost of 
bidding (i.e., giving up a higher probability of getting their policy 
committee). 

15 The details of this process are beyond the scope of this paper. For an in-depth 
analysis, see Shepsle and Weingast (1984, 1987). 

D. Implications for Coalition Formation 

The legislative committee system has two separate effects on coalition 
formation. First, agenda power held by committee members implies 
that successful coalitions must include the members of the relevant 
committee. Without these members, the bill will not reach the floor 
for a vote. This, in turn, implies that certain policies are unlikely to 
become law, for example, those that provide benefits only to a majority 
off the committee. In technical terms, committee veto implies that, 
from among the set of policies that command a majority against the 
status quo, only those that make the committee better off are possible 
(this issue is extensively explored in Shepsle and Weingast [1987]). 
This significantly reduces the feasible set of policies that may be im¬ 
plemented. 
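
The way the committee veto shrinks the feasible set can be sketched as a simple filter. The example below is a hedged illustration, not the paper's formal apparatus: the five legislators, the two-member committee, and the net-benefit figures are hypothetical, and "the committee is better off" is interpreted here as a majority of committee members gaining relative to the status quo.

```python
# Hypothetical net benefits (relative to the status quo) of three policies
# for five legislators; legislators "a" and "b" form the relevant committee.
COMMITTEE = ["a", "b"]
POLICIES = {
    "off_committee_majority": {"a": -1, "b": -1, "c": 2, "d": 2, "e": 2},
    "logroll_with_committee": {"a": 3, "b": 2, "c": 1, "d": -1, "e": -1},
    "committee_only": {"a": 5, "b": 5, "c": -1, "d": -1, "e": -1},
}


def passes_floor(net_benefits):
    """True if a strict majority of all legislators gains from the policy."""
    gainers = sum(1 for value in net_benefits.values() if value > 0)
    return gainers > len(net_benefits) / 2


def committee_allows(net_benefits, committee):
    """True if a strict majority of committee members gains (assumed veto rule)."""
    gainers = sum(1 for member in committee if net_benefits[member] > 0)
    return gainers > len(committee) / 2


def feasible_policies(policies, committee):
    """Policies that both command a floor majority and survive the committee veto."""
    return [
        name
        for name, net_benefits in policies.items()
        if passes_floor(net_benefits) and committee_allows(net_benefits, committee)
    ]
```

In this illustration the policy benefiting only a majority off the committee would win on the floor but never reaches it, while the policy benefiting only the committee reaches the floor but fails there; only the package that includes the committee in a majority coalition is feasible.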

Along these lines, we also note that since committees have rights to 
bring a single bill to the floor, trades among committee members are 
more likely to succeed than those across committees. This follows 
because there is less chance for such a deal to fall apart. When a 
coalition forms between members of two committees, legislators must 
agree to exchange votes on two separate bills. When a coalition forms 
among members of the same committee, they may bring a single bill 
to the floor. The latter allows a single up or down vote on the package 
(whereas the former does not), thereby affording less chance for re¬ 
neging. This suggests that drawing the jurisdictional boundaries be¬ 
tween committees is an important strategic variable that affects the 
pattern of coalitions. 16 Ceteris paribus, expected trading partners are 
better off if they are members of the same committee so that the 


16 See Ferejohn (1986) for a discussion of this issue in the context of a trade between 
the urban members on the Agriculture Committee (seeking food stamps) and the 
farmers on this committee (seeking continued farm benefits). He argued that being on 
the same committee advantaged these urban members over other potential legislative 
partners who were part of other committees that might have brought some other form 
of legislation providing some subsidy for food for the poor (the latter could have easily 
been written by, e.g., Ways and Means). 

optimal pattern of jurisdictions must in part reflect the expected pattern 
of trades. 

The second effect on coalitions concerns durability. The durability 
afforded by the committee system induces some rigidities into the 
coalition formation process. Under a market exchange mechanism, 
small changes in political circumstances would lead to a small change 
in the optimal set of bargains and coalitions. But under the committee 
system, small changes in circumstances do not automatically lead to 
changes in policy. To see this, consider the example explored above 
involving dams, bridges, and regulatory benefits. We showed above 
that committee veto power prevents the proponents of dams from 
easily reneging once their dams are built or if, because of a change in 
political circumstances, they find a more attractive coalition partner. 

This does not mean, however, that the dam-and-bridges legislators 
can never alter policy. Rather, it means that they must bid for seats on 
the committee and wait until they attain a majority. Small changes in 
political circumstances are not likely to make it worth the attempt. 
Therefore, the committee system implies that policy will respond only 
to large changes in political circumstances or to major shifts in the 
electorate. 17 

E. Controls over Committees 

Committees are decentralized decision-making units composed of 
those legislators with the greatest stake in their jurisdiction. Their 
power to decide what proposals (if any) are brought to the floor places 
them in an agency relation with the rest of the legislature. As with any 
form of delegation, this authority provides the potential for moral 
hazard. What prevents the committee from extracting too much surplus 
at the expense of other legislators? 

The committee system constrains the behavior of its subunits by 
restricting committee power. In particular, the majority rule condi¬ 
tion precludes any one committee from extracting too many gains at 
the expense of others. Suppose, for example, that one committee 
attempts to extract the entire budget. The majority rule requirement 
implies that this proposal must get a majority of legislators to give up 
the opportunity to spend some of the budget in their areas. They will 
do so only if the value of the last dollar from this proposal to them 
exceeds the value of the first dollar spent within their own jurisdiction. 
Since members value influence within their own jurisdictions, 
this situation is unlikely. Thus the voting rule plays an important 
constraining role over the opportunistic behavior of particular committees. 18 

17 We note that this phenomenon parallels vertical integration. There, long-term 
agreements also induce durability and rigidities: the contract is not renegotiated with 
each small change in economic circumstances (e.g., prices) and therefore does not 
respond to changes in the way a spot market does. 

F. Summary 

Instead of trading votes, legislators in the committee system in¬ 
stitutionalize an exchange of influence over the relevant rights. In¬ 
stead of bidding for votes, legislators bid for seats on committees 
associated with rights to policy areas valuable for their reelection. In 
contrast to policy choice under a market for votes, legislative bargains 
institutionalized through the committee system are significantly less 
plagued by problems of ex post enforceability. 

V. Evidence: The Distribution of Preference, 
Influence, and the Benefits of Committees 

In what follows, we provide evidence showing that choices and decision 
making in the U.S. Congress are consistent with our view. 19 
(Thus this is not a direct test between our model and the vote-trading 
approach.) 

The major feature of our model is that exchange takes place via 
institutionalization through the committees. By far the strongest piece 
of evidence from the U.S. Congress in favor of our approach con¬ 
cerns the pattern of membership and benefit flows for the various 
committees (Fiorina 1981a). Members from farming districts domi¬ 
nate the agriculture committees and oversee programs that benefit 
farmers. Members from urban districts sit on banking, housing, and 
welfare committees that provide benefits to an incredible array of 
urban constituents. Members with large defense installations or in¬ 
dustries dominate Armed Services committees. In each case, mem¬ 
bers mold policies in their jurisdiction to their constituents’ advan¬ 
tage. 

The model is based on a set of assertions about committee operation: 
(a) the assignment process operates as a self-selection mechanism; 
(b) committees are not representative of the entire legislature 
but instead are composed of “preference outliers,” or those who value 
the position most highly; and (c) most centrally, committee members 
receive the disproportionate share of the benefits from programs 
within their jurisdiction. Let us survey the empirical evidence supporting 
these propositions. 

18 In most legislatures, the amendment process places additional constraints on the 
behavior of committees. For details of this process for the U.S. Congress, including how 
it qualifies this argument, see Shepsle and Weingast (1987). The problem of how this 
body places constraints on committees has never received systematic treatment. 

19 Congress, unlike the British Parliament, meets the conditions set out in Sec. II. We 
briefly compare our findings for the American case with those of the British in Sec. VII. 

A. Committee Assignments 

At the beginning of each new congress, there are a number of vacant 
committee seats in some 25 committees and there are incoming fresh¬ 
men without seats. 20 They are encouraged to request only a small 
number of possible positions. Then party leaders attempt to match 
individual assignments with their freshman requests. There is, how¬ 
ever, a potential problem here: What prevents the system from break¬ 
ing down because everyone requests seats on the best and most pow¬ 
erful committees? How does the bidding mechanism actually select 
those freshmen willing to bid the most for particular committees? 

The mechanics of the assignment process are designed to work 
against breakdown. It turns out that there are certain committees 
(e.g., Post Office) that no one wants. Those who fail to get one of their 
requested slots are generally put on one of these committees. Re¬ 
questing the most valuable slots, therefore, increases the probability 
of ending up with Post Office. Suppose each freshman may poten¬ 
tially request a particular substantive policy committee (e.g., Agricul¬ 
ture, Housing and Welfare, or Public Works) valuable for his district 
that he has a high probability of getting. Which ones will opt instead 
to request the more powerful committees? Since the latter option 
involves a lottery between the most valuable committee and one worth 
virtually nothing, only those freshmen who value it most highly in 
comparison with the sure thing of getting on their policy committee 
will bid for it. 21 This lottery implies that revealed preferences reflect 
true preferences and shows how the assignment mechanism succeeds 


20 The following description relies on Shepsle (1975, 1978). While he did not discuss 
the preference revelation aspects of the assignment process, it is clear that the process 
must rely on some means of inducing truthful requests. Since few empirical contexts 
that make use of these mechanisms have been studied, his data remain an untapped 
source for further study. In what follows, we ignore for simplicity returning members 
who wish to change committees. For details on how this works, see Shepsle (1978). 

21 The following table reports the frequency distribution over the lengths of request 
lists (i.e., how many committees each freshman requests). Three-quarters of all freshmen 
(87th-93d Congress) ask for three or fewer out of 25. The number of observations 
is 231 (source: Shepsle 1978, p. 49). 

Committees requested     1     2     3     4     5 or More 
Percentage              23    16    36    15    10 


TABLE 1 

Freshman Assignment Success 

                        Proportion Receiving 
            ------------------------------------------ 
            First         Other         No 
Congress    Preference    Preference    Preference      N 

87th        .474          .368          .159            19 
88th        .500          .306          .194            36 
89th        .591          .254          .155            71 
90th        .308          .308          .384            13 
92d         .750          .144          .106            28 
93d         .891          .166          .193            26 
All         .585          .243          .172           193 

Source: Shepsle (1978, p. 193). 


in matching members with committees whose jurisdictions they value 
most highly. 
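
The self-selection argument amounts to an expected-value comparison, which the following Python sketch makes explicit. It is a simplified illustration under stated assumptions rather than the assignment procedure itself: the valuations are hypothetical, the fallback committee is assumed to be worth nothing, and the success probabilities are borrowed from table 2 (.305 for heavily contested seats, .944 for uncontested ones) purely for concreteness.

```python
# Success probabilities taken from table 2 for illustration.
P_POWER = 0.305   # first-preference success with more than 2 requests per vacancy
P_POLICY = 0.944  # first-preference success with less than 1 request per vacancy


def prefers_power_committee(v_power, v_policy, p_power=P_POWER,
                            p_policy=P_POLICY, v_fallback=0.0):
    """True if requesting the contested committee has the higher expected value.

    v_power, v_policy, and v_fallback are hypothetical valuations of the powerful
    committee, the freshman's policy committee, and the unwanted fallback seat.
    """
    expected_power = p_power * v_power + (1 - p_power) * v_fallback
    expected_policy = p_policy * v_policy + (1 - p_policy) * v_fallback
    return expected_power > expected_policy
```

Under these assumptions a freshman who values the powerful committee at 10 and his policy committee at 6 requests the policy committee, while one who values them at 10 and 2 takes the lottery. Only those with a large differential valuation bid for the contested seat, which is the sense in which requests reveal preferences.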

The evidence supporting this interpretation is twofold. First, table 
1 shows that the probability of a freshman's gaining one of his top 
three is above .8. 22 Second, and more important, table 2 shows that 
when there is no competition for a seat, the requester is virtually 
assured of getting his first choice (the probability is over .94); but the 
greater the competition, the less likely is a freshman to attain his first 
choice. There is also considerable evidence that freshman requests 
take into account competition for seats. 23 Competition of this sort 
appears necessary, though not sufficient, to ensure that bids reflect 
underlying preferences. 

Overall, then, the pattern of committee assignments looks remark¬ 
ably like an optimization process that maps members into those com¬ 
mittees they value the most. 


B. Committee Membership 

To be more systematic about committee membership, we have exam¬ 
ined indexes of member preferences over issues that correspond to 

22 Moreover, it is not clear that this frequency can be much higher because of the 
many accounting constraints (see Shepsle 1975) imposed on the problem (e.g., only one 
freshman per slot; each vacant slot must be filled). 

23 Shepsle (1978) provided one more piece of evidence for our model. Using probit 
analysis to predict which freshmen request particular committee slots, he estimated a 
set of simple demand equations. His results are consistent with our model, namely, that 
simple measures of constituency interest (e.g., number of agricultural workers, military 
employees, or housing) are good predictors of requests. Moreover, these estimates also 
show that freshmen rationally anticipate competition for different seats: when other 
factors are held constant, the estimated probability of a freshman's requesting a certain 
seat goes down as the number of competitors increases. 

TABLE 2 

Effects of Competition on Assignments 

                        Total Number of Effective 
                        Requests per Vacancy 
First Preference        --------------------------------------- 
Assignment Success      Less than 1     1-2     More than 2 

Yes                     94.4            67.2    30.5 
No                       5.6            32.8    69.5 

Source: Shepsle (1978, p. 201). 


major committee jurisdictions. This exercise reveals that members of 
the relevant committee or subcommittee significantly differ from the 
rest of the House. 24 Most indexes are computed by an interest group 
with a clear stake in the policy area being considered. Because they 
are constructed so as to indicate which congressmen are supporters of 
the group, these indexes are good proxies for supporters of the 
group’s interests. The scores computed by the AFL-CIO Committee 
on Political Education (COPE), for example, indicate pro- and antilabor 
congressmen; the American Security Council’s National Security 
Index (NSI) reveals supporters of a strong national defense and, 
apparently, opponents of foreign aid. 25 

The model predicts that representatives of particular interests gain 
policy benefits through membership on relevant committees. Hence 
we should observe that committees are composed of members who 
are significantly above-average supporters of the relevant interest 
group and, in particular, have interest group scores significantly 
above the mean for the entire Congress. 

This pattern is borne out by the results reported in table 3. The 
difference in preferences between committee members and the rest 
of the House is highly statistically significant. For a diversity of policy 
areas—defense, foreign aid, consumer protection, labor, and the 
environment—committee members are indeed significantly above- 
average supporters of benefits to the relevant interest group. 

Putting this evidence together with results from committee assign¬ 
ments reveals that legislators opt for committees relevant to their 
constituents’ interests and that their doing so leads to committees 

24 Though this would seem to be an obvious topic for political scientists, they have 
never systematically collected this type of data. Instead the literature typically provides 
anecdotal evidence, the best of which can be found, e.g., in Jones (1962) or Fenno 
(1973). 

25 Foreign aid to other nations, under the jurisdiction of the Foreign Relations Committee, 
seems to be a (political) substitute for military spending programs. The evidence 
suggests that those congressmen who support this aid tend to be against defense spending, 
and vice versa. 

TABLE 3 

Committee Members Are Preference Outliers 
Relative to the Full House (1978) 

                                          Full House   Committee 
                                          Mean(a)      Mean        N(b)   t-Statistic 

1. Armed Services: NSI                    59.1         76.8        38     17.87** 
2. International Relations: 
     NSI                                  61.7         50.2        37     11.42** 
     ADA(c)                               37.5         46.5        37     10.23** 
   t-test for mean NSI difference 
     between Armed Services and 
     International Relations                                              19.40** 
3. International Relations: International 
   Economic Policy and Trade 
   Subcommittee: 
     NSI                                  60.8         51.3         7      4.24** 
     ADA                                  38.1         45.0         7      3.50** 
4. Interstate Commerce: Consumer 
   Protection and Finance 
   Subcommittee: ADA                      37.9         55.5         8      9.57** 
5. Education and Labor: Economic 
   Opportunity Subcommittee: COPE         50.4         60.0         4      3.33** 
6. Environmental subcommittees: 
   LCV(d)                                 46.7         58.3        28      2.08* 

(a) All non-committee members. 
(b) Committee or subcommittee size. 
(c) Vote rating of the Americans for Democratic Action. 
(d) Includes two of the major subcommittees with oversight responsibility for the Environmental Protection 
Agency, the Subcommittee on Energy and the Environment (Interior Committee), and Subcommittee on Health 
and the Environment (Commerce Committee). LCV is the League of Conservation Voters scores for 1977. 
 * Significant at the .05 level. 
** Significant at the .01 level. 


composed of legislators with considerably higher support for policies 
within their jurisdiction. This pattern is precisely that expected by the 
view that committees institutionalize trades over influence so as to 
give their members greater control over policies within their jurisdiction. 


C. Committee Policy Benefits 

Do committee members receive a disproportionate share of the bene¬ 
fits from their committees? The evidence on preferences provides 
indirect support for this since committees disproportionately attract 
representatives seeking to provide their constituents with benefits. 
Here we summarize some direct evidence in favor of this proposition. 26 

1. Ferejohn (1974a) in his now-classic study on the pork barrel 
tested a variety of hypotheses about committees. He showed that the 
number of new projects started in each state is a function of commit¬ 
tee membership. His estimations imply, for example, that each mem¬ 
ber on the Public Works Committee yields an additional 0.63 new 
projects for his state. Further, each 10 years of service by representa¬ 
tives from a state yields approximately an additional project. Similar 
results are obtained regarding more than two dozen related hypoth¬ 
eses. 

2. Arnold (1979) studied three areas (military base closings, water 
and sewage grants, and model cities grants) and provides results simi¬ 
lar to Ferejohn’s about the pattern of benefits. 27 His contingency 
tables provide unambiguous evidence; we reproduce two. 

Table 4, part A, shows the frequency of acceptance of an applica¬ 
tion for a water and sewage grant, depending on a congressman’s 
position in the committee system: is he a member of the relevant 
appropriations subcommittee? the relevant authorization committee 
(Banking and Currency)? of neither? The table shows that members 
of the relevant committees systematically fare better than nonmem¬ 
bers. Those on neither committee have a probability of acceptance of 
.176. In contrast, members of the Appropriations Subcommittee have 
a probability of acceptance of .313 (80 percent larger), and members 
of the authorizing committee have a probability of acceptance of .281 
(60 percent larger). The differences are significant at the .001 level. 
Part B of the table shows that the same pattern holds for model cities 
project selection. For these projects, congressmen who are on neither 
relevant committee have a probability of selection of .29. The proba¬ 
bility of acceptance for members of the Banking and Currency Sub¬ 
committee, .62, is more than double that for nonmembers; the proba¬ 
bility for members of the Appropriations Subcommittee, .86, is nearly 
triple. 
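
The acceptance probabilities and the significance test for part A of table 4 can be checked with a short computation. The sketch below is an illustration rather than Arnold's own procedure: it applies a standard Pearson chi-square test of independence to the counts in the table, and it takes the "neither committee" accepted count to be 261, the value implied by that row's total of 1,484 decisions and its reported probability of .176.

```python
# Rows: (accepted, not accepted) for each committee position, table 4, part A.
WATER_SEWAGE = [
    (21, 46),      # Appropriations subcommittee
    (27, 69),      # Banking and Currency Committee
    (261, 1223),   # neither committee (accepted count inferred from row total)
]


def acceptance_rates(table):
    """Probability of acceptance for each row, rounded to three places."""
    return [round(accepted / (accepted + rejected), 3) for accepted, rejected in table]


def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of position and outcome.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic
```

Applied to these counts, the rates reproduce the .313, .281, and .176 quoted above, and the statistic comes out at roughly 13.8 with 2 degrees of freedom, in line with the value reported in the note to the table.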

26 Unfortunately, by far the biggest effort to support this proposition in the political 
science literature comprises anecdotal or descriptive material rather than systematic 
data analysis. While this literature supports our proposition, it is no substitute for 
systematic empirical investigation. 

27 We do not reproduce his probit estimates here (nor discuss his concerns about 
whether congressmen manipulate bureaucrats or bureaucrats manipulate congressmen). 
These estimates suffer from significant econometric problems and are therefore 
of questionable value. Simultaneity, much like that found in estimating supply and 
demand equations, plagues his design. 

TABLE 4 

Frequency of Acceptance of Applications 

Application                   Applications    Not         Total        Probability 
Represented                   Accepted        Accepted    Decisions    of Acceptance 

A. Water and Sewage Grant Selection (1970) 

Subcommittee of 
  Appropriations Committee        21              46          67          .313 
Banking and Currency 
  Committee                       27              69          96          .281 
Neither committee                261           1,223       1,484          .176 
Total                            309           1,338       1,647 

B. Model Cities Project Selection 

Subcommittee of 
  Appropriations Committee         6               1           7          .86 
Banking and Currency 
  Committee                        5               3           8          .62 
Neither committee                 38              78         116          .29 
Total                             49              82         131 

Source: Arnold (1979, pp. 139, 180). 
Note: For pt. A, chi-square = 13.80 and significance level is .001. For pt. B, chi-square 
= 10.81 and significance level is .01. 

3. Several recent studies by economists used similar methodologies 
and yielded similar evidence. Malone (1982), studying defense expenditures, 
showed that members of the Armed Services committees receive 
a statistically significant greater share of federal expenditures in 
this category, though Rundquist (1973) could find none. Faith, 
Leavens, and Tollison (1982) studied the geographic location of firms 
that are the target of antitrust suits brought by the Federal Trade 
Commission (FTC). They showed that firms located in districts represented 
on the FTC oversight subcommittees were systematically 
underrepresented in the set of suits brought by the commission. Cohen 
and Noll (1986), using an innovative methodology, derived similar 
results for federal R & D projects. 

4. Weingast and Moran (1983) studied the influence of Congress on 
the distribution of cases chosen by the FTC under the various statutes 
it administers. They found, for the Senate, that all members possess 
some influence but that members of the relevant subcommittee pos¬ 
sess more influence and that the subcommittee chairman possesses 
even more influence (see table 5). According to their estimates for 
textile cases (under the Fur, Wool, and Textile Labeling acts), a member 
of the subcommittee had nearly three times the effect of a nonmember 
while the chairman had 12 times the effect of a nonmember. 
Their results reveal a similar pattern for the other case types studied 
(credit cases, Robinson-Patman cases, and merger cases). 
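
The relative magnitudes quoted for textile cases follow directly from the entries of table 5. As a small worked check (the dictionary keys and function name are illustrative labels, not terms from Weingast and Moran), the ratios can be computed as follows.

```python
# Change in the probability of opening a textile case when a senator's
# ADA score rises 10 points, as reported in table 5.
EFFECTS = {"nonmember": 0.005, "member": 0.013, "chairman": 0.060}


def relative_effect(position, baseline="nonmember"):
    """Effect of a senator in the given position relative to the baseline position."""
    return EFFECTS[position] / EFFECTS[baseline]
```

The ratio for an ordinary subcommittee member is about 2.6, which is the "nearly three times" of the text, and the ratio for the chairman is 12.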

TABLE 5 

Change in the Probability of Opening a Textile Case When a Senator's 
ADA Score Increases 10 Points 

Senator's Position                         Change in Probability 

Not on the subcommittee                    .005 
On the subcommittee but not chairman       .013 
Subcommittee chairman                      .060 

Source: Weingast and Moran (1983). 

5. The pattern of campaign donations by firms provides additional 
evidence. A firm’s decision to donate money to a congressional campaign 
must pass the same test as any other investment made by the 
firm; namely, the expected value of the return must exceed the dol¬ 
lars invested. When deciding among politicians, firms must focus on 
those congressmen with a marginal impact on their future profitabil¬ 
ity. If committee members have a disproportionate influence over 
policy choice in their area, then they should attract a disproportionate 
share of campaign contributions from firms affected by the commit¬ 
tee’s policy jurisdiction. 

This prediction is clearly borne out in Munger’s (1984) study. He 
estimated a probit model of the probability that a certain legislator 
receives a donation from a given firm. He showed that political action 
committees are systematically more likely to donate to members of 
committees that affect their firms: the probability that a committee 
member will receive a donation is .34 higher than that of a non- 
member. 


VI. Comparative Statics: Predictions 
and Evidence 

In a simple market for votes, a small change in the relative composi¬ 
tion of interest groups leads to a small change in the demand for 
votes. This, in turn, leads to a small change in the equilibrium pattern 
of exchange and hence in the distribution of policy costs and benefits. 
However, our argument about the demand for durable policies and 
the evolution of institutions to provide them implies that policies are 
partially insulated from small changes in member preferences. Because 
committees retain a veto over policy change, we must look to 
how these changes affect committee members. If the change in interest 
groups affects only legislators who are not members of the committee, 
then policy change is significantly less likely. But our model 
also leads to an important comparative statics prediction: a sufficient 
condition for policy change is that there is a substantial turnover in 
committee membership so that the new holders of committee property 
rights have preferences that differ from those of their predecessors 
(see Weingast 1981; Weingast and Moran 1983). 

While comparative statics results are a primary tool of prediction 
and testing in economics, few studies of political economy have used 
this approach to test theories of politics. Nonetheless, there exists 
some evidence on the prediction noted above in the empirical litera¬ 
ture. We cite these studies and then suggest further tests. 

A. Appropriations 

Ferejohn (1974a) again plays an important role here. During the 
1950s and early 1960s, fiscal conservatives dominated the congres¬ 
sional appropriations process. Further, during this period, committee 
leaders had nearly absolute power of assignment of members to subcommittees. 
One way of enforcing fiscal restraint was to assign members 
of the Appropriations Committee to a subcommittee only if they 
had no stake in the subcommittee's jurisdiction. By the mid-1960s, 
however, this rule had gone by the wayside so that subcommittees 
came to be composed of members with a high stake in their jurisdic¬ 
tion. Ferejohn showed that, for the Public Works Subcommittee, this 
led to a statistically significant increase in appropriations. 

B. Regulatory Agencies 

A host of recent studies of regulatory agencies has shown that com¬ 
mittee members have substantial influence over agencies within their 
jurisdiction (Barke and Riker [1982] on the Interstate Commerce 
Commission, Grier [1984] on the Fed, Moe [1985] on the National 
Labor Relations Board, and Weingast and Moran [1983] on the FTC). 
In nearly all cases, these statistical studies showed that, as committee 
preferences change, so too does agency policy. Large swings in com¬ 
mittee preferences lead to large swings in policy. 
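
The comparative statics behind these findings can be sketched in a deliberately stripped-down, one-dimensional setting. The function below is an assumption-laden illustration in the spirit of a standard agenda-setter model, not the model developed in this paper: policy lies on a line, preferences are symmetric and single peaked, the floor median is decisive, ties go to the proposal, and the committee both proposes and may decline to propose. All numerical values in the usage note are hypothetical.

```python
def setter_outcome(status_quo, floor_median, committee_ideal):
    """Policy outcome when a committee with veto and proposal power faces the floor.

    The floor median weakly prefers any point between the status quo and its
    reflection through the median; the committee proposes the point in that
    interval closest to its own ideal. Because the status quo is an endpoint of
    the interval, a committee on the far side of the status quo simply keeps it.
    """
    reflection = 2 * floor_median - status_quo
    low, high = min(status_quo, reflection), max(status_quo, reflection)
    return min(max(committee_ideal, low), high)
```

With a status quo of 0, a floor median of 5, and a committee ideal of 7, the outcome is 7. Moving the floor median to 5.5 leaves the outcome unchanged, whereas replacing the committee with one whose ideal is 2 moves policy to 2, and a committee with ideal -3 blocks any change. Shifts confined to non-committee members need not move policy in this sketch; turnover on the committee does.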

Weingast and Moran (1983), for example, studied the recent policy 
change at the FTC. In 1979 and 1980, the commission’s aggressive 
consumer activist policies were halted by Congress. While this action 
was hailed as Congress’s finally catching a runaway, out-of-control 
bureaucracy, Weingast and Moran showed that nothing of the sort 
happened. Instead, the FTC had been under the influence of the 
relevant subcommittee all along. From the late 1960s through the mid 
to late 1970s, this subcommittee both favored and fostered aggressive 
consumerist policies. However, following the 1976 election, a nearly 
complete turnover in membership brought to power members with 
substantially different preferences. Weingast and Moran interpreted 
the 1979-80 episode as the new committee's simply reversing the 
policies of their predecessors rather than catching an uncontrollable 
bureaucracy. Their statistical tests support this interpretation. 


VII. Discussion 

Representatives of different constituencies have considerable incen¬ 
tives to exchange support so as to provide benefits to their supporters. 
Because the value of today’s legislative bargains depends on actions 
taken in future legislative sessions, legislators also have incentives to 
devise institutions that provide today’s bargains with durability. As in 
all exchange settings, the institutions that evolve to support the ex¬ 
change reflect the specific pattern of transaction costs underlying the 
potential trades. For legislatures these include the possibility of con¬ 
tingencies too numerous (or costly) to specify in advance and private 
information. This gives rise to a host of institutions underpinning a 
set of property rights loosely referred to as the committee system. We 
showed that these institutions lower the risk of ex post opportunistic 
behavior that would plague explicit exchanges of votes. The legisla¬ 
tive institutions therefore lower the agency costs associated with ex¬ 
change. 

In addition we showed why this set of institutions is superior to a 
market exchange mechanism. Instead of trading votes, legislators ex¬ 
change special rights affording the holder of these rights additional 
influence over well-defined policy jurisdictions. This influence stems 
from the property rights established over the agenda mechanisms, 
that is, the means by which alternatives arise for votes. The extra 
influence over particular policies institutionalizes a specific pattern of 
trades. When the holders of seats on committees are precisely those 
individuals who would bid for votes on these issues in a market for 
votes, policy choice under the committee system parallels that under a 
more explicit exchange system. Because the exchange is institutional¬ 
ized, it need not be renegotiated each new legislative session, and it is 
subject to fewer enforcement problems. 

The committee system also influences coalition formation. Commit¬ 
tee agenda power implies that successful coalitions in the area of the 
committee’s jurisdiction must include the committee. This rules out, 
for example, policies that benefit solely a coalition of members off this 
committee, and this holds even if this coalition contains a majority of 
the entire legislature. Unless a coalition of non-committee members 
is prepared to include or “buy out” the committee, veto power allows
the committee to block access of this coalition to the floor. 

We also showed that policy bargains, and hence coalitions, are more 
durable under the committee system. Thus the decision to enter into 
such an agreement is much like entering a long-term contract, and
legislators will take this into account. This implies that coalitions will
not always respond to small changes in political circumstances as they 
would under a spotlike market exchange system. Rather they tend to 
respond only to large shifts or major political realignments. Commit¬ 
tee veto power combines with the property right system over seats to 
play an important role in maintaining a political coalition—and a 
particular policy—for long periods. Policy in a particular area may 
remain stable if committee membership is relatively stable, and this 
can hold even with major changes in the preferences of members off 
the committee. The ability to veto the proposals of others is a subtle 
yet powerful tool used by committees to influence policy in their 
jurisdiction (Weingast and Moran 1983; Shepsle and Weingast 1987). 

This argument raises some interesting parallels and contrasts with 
those provided for vertical integration in market settings. In both 
cases, institutions are designed to prevent similar forms of incentive 
problems, for example, ex post opportunism. However, it appears 
that the source of these problems differs. For the case of vertical 
integration, it is relation-specific assets. For the legislature, however, 
incentive problems arise because there is no underlying medium of 
exchange so that trading votes requires future reliance and hence the 
opportunities for reneging (see n. 10). Moreover, as Ferejohn (1974)
has shown, it is not clear whether one can exist, given the peculiar
externalities associated with vote trading. 

We have pursued in this paper only one explanation for enforcing 
trades. It is useful, therefore, to discuss a number of potential alterna¬ 
tives, though a full-scale empirical investigation is beyond the scope of 
this paper. The first alternative is that ex post opportunism either is 
negligible or is handled in some other way, thereby allowing exchange 
to take place through trading. According to this view, the existence of 
committees is epiphenomenal, perhaps representing some formal 
(though unimportant) recognition of those legislators who have in 
fact “bought” influence over particular issues. An empirical test be¬ 
tween this explanation and our model might focus on the respon¬ 
siveness of policy choice to members of the committee. In an explicit 
exchange setting, large changes in the preferences of members off 
the committee should lead to changes in policy. Under the legislative 
committee model, committee veto rights imply that policy is more 
insulated from changes of this type, and hence we should observe 
policies to be less responsive. 

A second competing explanation is perhaps more interesting. Par¬ 
ties, ruled out by assumption in our model, offer an obvious alterna¬ 
tive for institutionalizing and enforcing trades. The historical evi¬ 
dence for the U.S. Congress suggests that strong parties and strong 
committees, as institutional underpinnings of legislative exchange, 
are substitutes. When parties were more powerful (e.g., at the turn of 
the century), committees, though important, did not have such clear-cut
rights as in modern times. Seniority, for example, was regularly
violated by party leadership in allocating the leadership positions 
within committees. Importantly, virtually every institutional change 
during this century that has made committee rights stronger has 
come at the expense of parties and centralized leadership. 

This suggests a natural extension of our approach to the case of 
party government (which includes the British Parliament in addition 
to the House of Representatives of the past). Strong parties are char¬ 
acterized by control over important resources such as entry into the 
competition for individual seats and the positions of power within the 
legislature (e.g., the ministerial positions in Britain), and they wield 
considerable influence over the distribution of legislative (read: elec- 
torally useful) benefits. Parties, like firms, can build types of reputa¬ 
tions different from those of the individuals who make them up (see, 
e.g., Kreps 1984). To the extent that they are able to influence the 
behavior of their members through distribution of resources, parties 
potentially provide an alternative means of enforcing agreements. We 
hope to extend our approach in the future to yield results about the 
institutions underpinning legislative exchange in this context. An
important issue of this research concerns the circumstances favoring 
the survival of one mechanism over the other. 

One limitation of our analysis is that, while we argue that legislative 
rules mitigate certain contractual problems, we do not explain how 
the rules themselves survive. Since majorities may alter the rules, what 
prevents the breakdown of cooperation that takes on a slightly different
form? In circumstances in which reneging, say, would occur without
rules, what prevents individuals from first voting to change the
rules and then reneging? An extensive investigation of this issue is
beyond the scope of this paper. However, there appear to be a variety
of circumstances under which the rules will survive a breakdown
whereas cooperation without rules would not. For example, if many
different policy jurisdictions are governed by the same set of rules,
then a single set of rules may link behavior in one area with that in 
another. Hence incentives to renege in one area do not automatically 
result in corresponding incentives to change rules that govern many 
areas. 25 Since it clearly touches on issues that hold for a large variety
of organizations, this question is worthy of a separate investigation. 


For an interesting beginning on this problem, see Leibowitz and Tollison (1980).
As a second set of circumstances, we single out the notion of leadership explored
by Calvert (1986) in his extension of the Kreps and Wilson (1982) model to legislatures.
Calvert studied circumstances in which a particular individual is given resources by
other individuals. With these resources, he then, e.g., polices the behavior of his followers.
In principle, this mechanism might be used to prevent the breakdown of cooperation
in certain circumstances and therefore be valuable ex ante to members.

The empirical evidence supports four implications that follow from
our model of legislative institutions but do not follow from a simple 
market exchange mechanism. First, committees are composed of 
“high demanders," that is, individuals with greater than average in¬ 
terest in the committee's policy jurisdiction. Second, the committee 
assignment mechanism operates as a bidding mechanism that assigns 
individuals to those committees they value most highly. Third, com¬ 
mittee members gain a disproportionate share of the benefits from 
their policy area. This appears to hold across widely differing policy 
jurisdictions. Fourth, there exists important evidence supporting a 
comparative statics prediction of the model, namely, that as the inter¬ 
ests represented on the committee change, so too will policy, with the 
interests of non-committee members held constant. Evidence sup¬ 
porting this proposition exists in several regulatory areas; future tests 
will reveal the robustness of the results. 

In sum, the institutions of Congress appear remarkably suited to 
legislators' reelection goals. Their specific form appears to have 
evolved to reduce problems that also arise in market exchange, 
namely, problems of measurement, moral hazard, and opportunism. 


On the Optimal Pricing Policy of a Monopolist


Charles A. Wilson 

The paper presents a simple explanation of price dispersion by a
monopolist assuming only that consumers arrive in a random order 
and are served on a first-come-first-served basis. A firm can some¬ 
times increase its profits by charging two different prices for the 
same good and rationing sales at the lower price. However, it is 
never necessary to charge more than two prices, and a single price is 
sufficient as long as either the marginal revenue curve is everywhere 
downward sloping or the marginal cost of production is constant. 


I. Introduction 

When will a monopolist charge the same price for every unit that it
sells? It is well known that a monopolist may have an incentive to 
charge different prices to different consumers if it can identify the 
demand curve of the different consumers and prohibit arbitrage. 
Even if individual consumers cannot be identified, it is sometimes 
possible to price-discriminate by offering quantity discounts. If the 
cost of search is explicitly modeled, Salop (1977) has shown that a 
monopolist can sometimes benefit from randomizing over its prices 
across outlets (or presumably across time) in order to exploit differ¬ 
ences in the search costs of agents with different reservation values. 
In this note, I examine another explanation of nonuniform pricing by 
a monopolist. 

Suppose that a large but fixed number of consumers arrive in random
order to buy a homogeneous good. Before any consumers arrive,
the firm puts a separate price tag on each unit for sale. On
arriving, each consumer then purchases at the lowest available price 
up to the point at which the price of the next unit exceeds his reserva¬ 
tion value. Unless the firm has a constant (or decreasing) marginal 
cost of production, it may have an incentive to charge different prices 
for different units of the good. 

For simplicity, suppose that the firm has 300 units to sell. There are 
100 consumers. Each consumer is willing to purchase a single unit at a 
price of 2 and four additional units at a price of 1. Clearly, if the firm 
must charge the same price for all the units, it will charge a price of 1, 
resulting in a total revenue of 300. However, if we permit the firm to 
charge a different price for different units, it could increase its reve¬ 
nue by selling the first 250 units at price 1 and the last 50 units at price 
2. The first 50 consumers purchase the 250 units at price 1. The next 
50 purchase one unit each at price 2. The revenue of the firm is 
consequently increased from 300 to 350. 
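
The arithmetic of this example can be replayed with a short simulation. The sketch below is only an illustration and not part of the original analysis: the function name `simulate` and the list-of-valuations encoding of a consumer are assumptions made here. Because all consumers in the example are identical, the random arrival order does not affect the result.

```python
def simulate(price_tags, consumers):
    """Serve consumers in arrival order.

    Each consumer buys the cheapest remaining unit as long as its price tag
    does not exceed his reservation value for his next unit.
    `consumers` is a list of valuation lists, each sorted from high to low.
    """
    tags = sorted(price_tags)      # cheapest units are always bought first
    revenue = 0
    next_unit = 0                  # index of the cheapest unsold unit
    for valuations in consumers:
        for value in valuations:
            if next_unit < len(tags) and tags[next_unit] <= value:
                revenue += tags[next_unit]
                next_unit += 1
            else:
                break              # next unit is too expensive, or stock is gone
    return revenue


# 100 identical consumers: one unit valued at 2, four more valued at 1.
consumers = [[2, 1, 1, 1, 1]] * 100

single_price_low = simulate([1] * 300, consumers)           # all 300 units at price 1
single_price_high = simulate([2] * 300, consumers)          # all 300 units at price 2
two_price = simulate([1] * 250 + [2] * 50, consumers)       # 250 units at 1, 50 at 2

print(single_price_low, single_price_high, two_price)
```

With 300 units, a uniform price of 1 sells everything, a uniform price of 2 sells only 100 units, and the mixed policy sells all 300 units while collecting the higher price on the last 50.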

The next three sections contain a characterization of the optimal 
pricing policy for any left-continuous, downward-sloping demand 
curve generated by a large number of consumers. For a fixed level of 
output, the optimal pricing policy can be found as the solution to a 
linear programming problem in the space of nonnegative measures 
on the space of prices. From this, a number of properties of the 
solution can be established. First, the firm need never charge more 
than two prices to maximize its revenue. Second, in contrast to the 
case in which the monopolist is constrained to using a single price, the 
marginal revenue will always be a nonincreasing function of the quantity.
In fact, it is precisely in those cases in which the single-price
marginal revenue curve is strictly increasing over some range that the
firm has an incentive to charge more than one price. Finally, at any
level of output at which the firm charges more than one price, the
marginal revenue is constant. Consequently, if the firm can produce
output at a constant marginal cost, then its revenue can be maximized 
by choosing a level of output for which a single price is optimal. 

Notice that the example above contains all these features. First,
because of the discontinuity in demand, the (single-price) revenue is
not a concave function of quantity. Second, because of the capacity
constraint, marginal cost is not constant. Finally, the optimum is attained
using only two prices.

In Section VII, I discuss the welfare implications of using a two-price
policy, and in Section VIII, I conclude with a brief discussion of
some applications. 

Although I conduct the analysis in the context of a static model, 
similar results can be established in a dynamic model in which the flow
demand is stationary and the cost function refers to the flow of pro¬ 
duction. In this case, a firm with an upward-sloping marginal cost 
curve might find it optimal to randomize from day to day (or week to 
week) over two different prices in order to achieve a flow demand 
equal to (on average) its flow supply. In any case, the argument de¬ 
pends critically on an informal appeal to the law of large numbers to 
identify the realization of the effective demand curve with its average. 
If we explicitly account for discreteness in demand, we are con¬ 
fronted with a complicated integer programming problem. For this 
problem, V. J. Krishna and Motty Perry (in private correspondence) 
have shown that the firm will charge a single price when marginal cost 
is zero; otherwise, very little seems to be known about the optimal 
pricing policy. 

II. The Model 

A large number of consumers visit a firm in random order to purchase
a homogeneous good. Before any consumers arrive, the firm
must choose the quantity to be offered for sale and the price of each 
individual unit. When it is his turn to purchase, a consumer purchases 
from the available units up to the point at which the price of the next 
unit exceeds the price he is willing to pay. Under the assumption that 
goods are perfectly divisible, a pricing policy is then a measure $Q$ on
the set of prices $(0, \infty)$, where, for any subset of prices $P$, $Q(P)$ denotes
the number of units for sale at those prices. Suppose that the demand
curve of any positive measure of consumers is proportional to the
aggregate demand curve $D(\cdot)$.

To simplify the mathematics, initially suppose that $D(\cdot)$ is a left-continuous,
nonincreasing step function on a finite set of prices $\{p_0, \ldots, p_n\}$
such that $0 = p_0 < p_1 < \cdots < p_n$ with $D(p) = 0$ for $p > p_n$.
Since the firm has no incentive to charge any price outside of the set
$\{p_0, \ldots, p_n\}$, a pricing policy $Q$ can be represented as a vector
$(q_0, \ldots, q_n)$, where $q_i = Q(p_i)$ is the number of units for sale at price $p_i$.

We will proceed as follows. First, we will determine the set of pric¬ 
ing policies that are consistent with the sale of q units of the good. 
Then we will determine which pricing policies from this set maximize 
the profits of the firm. Finally, we will examine how the pricing policy 
and revenue change with the number of units sold. 

III. The Optimal Pricing Policy of a Firm Selling 
a Fixed Quantity 

Since there is no aggregate uncertainty, we may suppose that the firm 
plans to sell every good to which it attaches a (finite) price. Since 
consumers always purchase at the lowest available price, we may also
assume that the goods are sold in order of their price. For notational
ease, let $D_i = D(p_i)$ and let $E_i$ be the measure of excess demand at
price $p_i$ after the goods offered at prices $p_0$ to $p_i$ have been sold. Then,
since the units priced at $p_0$ are sold first, it follows that

$$E_0 = D_0 - q_0 = D_0\left[1 - \frac{q_0}{D_0}\right].$$

Suppose that $E_0 \ge 0$. Then the excess demand for goods priced at $p_1$ is
equal to the excess demand at price $p_0$ times the fraction of those
consumers who stay in the market when the price rises from $p_0$ to $p_1$
minus the supply of goods offered at price $p_1$. The excess demand at
price $p_0$ is just $E_0$. The fraction of consumers who stay in the market is
the ratio of the number of consumers with reservation value greater
than or equal to $p_1$ to the number of consumers with reservation value
greater than or equal to $p_0$, $(D_1/D_0)$. The supply of goods offered at
price $p_1$ is just $q_1$. Consequently,

$$E_1 = E_0 \frac{D_1}{D_0} - q_1 = D_1\left[1 - \frac{q_0}{D_0} - \frac{q_1}{D_1}\right].$$

Proceeding by induction, we may conclude that as long as $E_{i-1} \ge 0$,
then

$$E_i = E_{i-1} \frac{D_i}{D_{i-1}} - q_i = D_i\left[1 - \sum_{j=0}^{i} \frac{q_j}{D_j}\right].$$

If all the units offered are to be sold, then $E_n$ must be nonnegative.
Consequently, any pricing policy must satisfy

$$\sum_{i=0}^{n} \frac{q_i}{D_i} \le 1. \tag{1}$$

If the total sales of the firm are $q$ units, then

$$\sum_{i=0}^{n} q_i = q. \tag{2}$$

For the remainder of the paper, we will restrict attention to $q \in [0, D(0)]$.
The problem of the firm wishing to sell $q$ units is therefore the
following. 

If $D_0 = \infty$, then the demand at any higher price is unaffected by the supply of goods
at price $p_0 = 0$. In this case, we could simply start by considering the excess demand at

Problem 1. Choose a pricing policy $Q = (q_0, \ldots, q_n) \ge 0$ to
maximize $\sum_{i=0}^{n} p_i q_i$ subject to constraints (1) and (2).

Problem 1 is a linear programming problem in $(q_0, \ldots, q_n)$ with two
constraints. In general, a maximum of a linear programming problem
can always be attained at a vector for which the number of positive
elements is less than or equal to the number of constraints (see,
e.g., Gale 1960, p. 84). 2 Consequently, the revenue-maximizing monopolist
need never charge more than two prices.
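
The rationing constraint behind this linear program can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's own procedure: it assumes the excess-demand recursion $E_i = E_{i-1} D_i / D_{i-1} - q_i$ with $E_0 = D_0 - q_0$, and the function names are hypothetical. The numbers come from the introductory example, where demand is 500 units at price 1 and 100 units at price 2.

```python
def excess_demand(D, q):
    """Return [E_0, ..., E_n] from E_i = E_{i-1} * D_i / D_{i-1} - q_i."""
    E = [D[0] - q[0]]
    for i in range(1, len(D)):
        E.append(E[-1] * D[i] / D[i - 1] - q[i])
    return E


def closed_form(D, q):
    """Return D_i * (1 - sum_{j <= i} q_j / D_j), the solved recursion."""
    out, running = [], 0.0
    for Di, qi in zip(D, q):
        running += qi / Di
        out.append(Di * (1.0 - running))
    return out


def satisfies_rationing_constraint(D, q, tol=1e-9):
    """Check sum_i q_i / D_i <= 1, i.e. every offered unit can be sold."""
    return sum(qi / Di for qi, Di in zip(q, D)) <= 1.0 + tol


prices = [1.0, 2.0]
D = [500.0, 100.0]      # demand at each price in the introductory example
q = [250.0, 50.0]       # units offered at each price

E = excess_demand(D, q)
revenue = sum(p * qi for p, qi in zip(prices, q))
print(E, closed_form(D, q), satisfies_rationing_constraint(D, q), revenue)
```

Offering 300 units at price 1 together with 50 units at price 2 would violate the constraint, since $300/500 + 50/100 > 1$: too few high-valuation consumers would remain to buy the expensive units.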

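We may also establish this result directly. Appealing to the Kuhn-Tucker
conditions (see, e.g., Dixit 1976, p. 63), a pricing policy $(q_0, \ldots, q_n) \ge 0$
is a solution to problem 1 if and only if relations (1) and
(2) are satisfied and there is a $\lambda \in (-\infty, \infty)$ and a $\mu \ge 0$ such that

$$D_i(p_i - \lambda) \le \mu, \quad \text{with equality if } q_i > 0, \quad i = 0, \ldots, n, \tag{3}$$

$$\mu\left[1 - \sum_{i=0}^{n} \frac{q_i}{D_i}\right] = 0. \tag{4}$$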

These conditions imply the following lemma. 

Lemma 1. (a) Constraint (1) is satisfied with equality if and only if
$q \ge D_n$. (b) If $q < D_n$, then $q_i = 0$ for $i < n$.

Proof. If $q < D_n$, then, since $D$ is nonincreasing, constraint (2) implies
$\sum_{i=0}^{n} (q_i/D_i) \le (1/D_n) \sum_{i=0}^{n} q_i = q/D_n < 1$. Conversely, if
$\sum_{i=0}^{n} (q_i/D_i) < 1$, then equation (4) implies $\mu = 0$. It then follows from
relation (3) that $\lambda \ge p_n > p_i$ for $i < n$ and hence, from relation (3), that
$q_i = 0$ for $i < n$. Therefore, $q = q_n < D_n$. Q.E.D.

Proposition 1. For a fixed level of sales, a monopolist can always
achieve its maximum revenue by charging no more than two prices 
for different units. 

Proof. Let $\lambda$ and $\mu \ge 0$ satisfy relations (1)-(4) for some vector
$(q_0, \ldots, q_n) \ge 0$. Let $l = \min\{i : q_i > 0\}$, let $h = \max\{i : q_i > 0\}$, and
suppose $l < h$. Then it follows from lemma 1 that constraint (1) holds with
equality. Since $D$ is nonincreasing, it then follows from equation (2)
that $(q/D_l) \le 1 \le (q/D_h)$. Consequently, we may choose $\hat q_l \ge 0$ and
$\hat q_h \ge 0$ such that $\hat q_l + \hat q_h = q$ and $(\hat q_l/D_l) + (\hat q_h/D_h) = 1$.
Let $\hat q_i = 0$ for $i \ne l, h$. Then it is easy to check that $\lambda$ and $\mu$ also
satisfy relations (3) and (4) for $(\hat q_0, \ldots, \hat q_n)$. Consequently,
$(\hat q_0, \ldots, \hat q_n)$ is a solution to problem 1. Q.E.D.

2 Even more generally, the maximum of a convex function over a convex set can
always be achieved at an extreme point (Rockafellar 1970, p. 348). It is easy to show that
any extreme point of a set defined by two linear functions and nonnegativity constraints
can be positive at, at most, two components. This fact was first pointed out to me by
Stan Reiter (see Chernoff and Reiter 1954).


IV. Properties of the Revenue Function 

We will examine next how the optimal policy and the revenue change 
with an increase in the level of sales. The argument in the previous 
section depends in no essential way on the assumption that the demand
curve was a step function. 3 For the remainder of the paper,
therefore, we will take $D(\cdot)$ to be any nonincreasing, nonnegative,
left-continuous function defined for all positive prices with the
property that $D(p) > 0$ for some $p > 0$. To ensure the existence of a
solution, we will also assume that $\lim_{p \to \infty} pD(p) = 0$.

For any $q \in [0, D(0)]$, let $R(q)$ denote the maximum revenue that
can be obtained from selling $q$ or fewer units. We will refer to $R(\cdot)$ as
the “revenue function.” Let $\bar p = \sup\{p : D(p) > 0\}$ and $\bar D = D(\bar p)$.
Notice that $\bar p$ plays the same role as $p_n$ in Section III so that lemma 1 is
still valid if we replace $p_n$ with $\bar p$.

In the light of proposition 1, we may restrict attention to pricing
policies that concentrate all sales at two prices. Let $((p_l, q_l), (p_h, q_h))$
denote such a pricing policy that offers quantities $q_l > 0$ and $q_h > 0$ at
prices $p_l$ and $p_h$, respectively. We define $D_h = D(p_h)$ and $D_l = D(p_l)$
and will adopt the convention that if $p_l \ne p_h$, then $p_l < p_h$. As we shall
see in the proof of proposition 2, relation (3) implies that $D_h < D_l$
whenever $p_l < p_h$.

We will use the following implication of lemma 1 and proposition 1. 

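Lemma 2. Suppose that $((p_l, \hat q_l), (p_h, \hat q_h))$ is an optimal policy for
$\hat q \ge \bar D$. Then for any $q \in [D_h, D_l]$,

$$R(q) = p_l D_l \frac{q - D_h}{D_l - D_h} + p_h D_h \frac{D_l - q}{D_l - D_h}, \tag{5}$$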


If the price space is $[0, \infty)$, then relations (1) and (2) may be expressed as

$$\int Q(dp) = q \tag{1'}$$

and

$$\int [1/D(p)]\,Q(dp) \le 1. \tag{2'}$$

The problem is then to choose a nonnegative measure $Q$ to maximize
$\int p\,Q(dp)$ subject to constraints (1') and (2'). This is still an (infinite-dimensional) linear
programming problem. Since the constraints also satisfy the necessary constraint
qualifications (see, e.g., Dixit 1976, p. 54), the infinite-dimensional analogues to relations
(1)-(4) are still necessary and sufficient conditions for a solution.




and $R(q)$ can be attained with policy $((p_l, q_l), (p_h, q_h))$, where $q_l$ and $q_h$
are defined by

$$\frac{q_l}{D_l} + \frac{q_h}{D_h} = 1 \tag{6}$$

and

$$q_l + q_h = q. \tag{7}$$

Proof. Suppose that, for some $\hat q \ge \bar D$, policy $((p_l, \hat q_l), (p_h, \hat q_h))$ attains
$R(\hat q)$. Then there are a $\lambda \in (-\infty, \infty)$ and a $\mu \ge 0$ that satisfy relations
(1)-(4). Now consider any $q \in [D_h, D_l]$. Then equations (6) and (7)
have a unique nonnegative solution $(q_l, q_h)$. Furthermore, together
with $\lambda$, $\mu$, and $q$, pricing policy $((p_l, q_l), (p_h, q_h))$ also satisfies relations
(1)-(4). Therefore, $R(q) = p_l q_l + p_h q_h$. Solving equations (6) and (7)
for $q_l$ and $q_h$ as functions of $q$ then yields equation (5). Q.E.D.

Note that equations (6) and (7) imply that, when the firm charges
more than one price, an increase in total sales requires that it increase
its sales of the lower-priced good and decrease its sales of the higher-priced
good. 4
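
This comparative-statics claim can be illustrated by solving the two linear equations directly. The sketch below assumes that the equations are $q_l/D_l + q_h/D_h = 1$ and $q_l + q_h = q$; the function name `two_price_split` and the numerical values ($D_l = 5$, $D_h = 1$) are chosen here purely for illustration.

```python
def two_price_split(q, D_l, D_h):
    """Solve q_l/D_l + q_h/D_h = 1 and q_l + q_h = q for (q_l, q_h).

    Requires D_h < D_l and D_h <= q <= D_l so that both quantities are
    nonnegative.
    """
    q_l = D_l * (q - D_h) / (D_l - D_h)
    q_h = D_h * (D_l - q) / (D_l - D_h)
    return q_l, q_h


D_l, D_h = 5.0, 1.0
low_3, high_3 = two_price_split(3.0, D_l, D_h)   # total sales of 3
low_4, high_4 = two_price_split(4.0, D_l, D_h)   # total sales of 4

# Raising total sales shifts units toward the low price and away from the high price.
print((low_3, high_3), (low_4, high_4))
```

Raising total sales from 3 to 4 moves the low-price quantity from 2.5 to 3.75 and the high-price quantity from 0.5 to 0.25, so the low-price quantity rises by more than the increase in total sales.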

Using lemma 2, we may establish the following proposition. 

Proposition 2. (a) The revenue function is concave. (b) If it is
optimal for the firm to charge two prices to sell quantity $q$, then
marginal revenue is constant over some interval around $q$.

Proof. Part a follows from the fact that problem 1 is a concave
programming problem (see, e.g., Dixit 1976, chap. 3).

To prove part b, suppose that, for some $\hat q \in (0, D(0)]$ and $p_l < p_h$,
pricing policy $((p_l, \hat q_l), (p_h, \hat q_h))$ attains $R(\hat q)$. Then lemma 2 implies that,
for any $q \in [D_h, D_l]$, $R(q)$ is defined by equation (5). Since this function
is linear in $q$, part b will be proved if we can show that $D_h < \hat q < D_l$.

Relation (3) implies that there is a $\lambda \in (-\infty, \infty)$ and a $\mu \ge 0$ such that
$D_l(p_l - \lambda) = D_h(p_h - \lambda) = \mu \ge 0$. This implies that $D_l > D_h$. It then
follows from equations (6) and (7) that $\hat q/D_l = (\hat q_l + \hat q_h)/D_l < 1 < (\hat q_l + \hat q_h)/D_h = \hat q/D_h$
and hence that $D_h < \hat q < D_l$. Q.E.D.


V. The Optimal Choice of Quantity and the 
Conditions for a Single Price 

When the monopolist is constrained to charge a single price, the
revenue function need not be concave. In this section, we demonstrate
that it is precisely when the single-price revenue function is not
concave that the firm has an incentive to charge more than one price
at some output levels.

4 There is a close analogy here to the Rybczynski line in the standard two-by-two
sector general equilibrium model (see, e.g., Jones 1965).

Let $D^{-1}(q)$ be the highest price such that $D(p) \ge q$ and let $R_s(q)$ be the
revenue to the firm when it is constrained to sell $q$ units at the single
price $D^{-1}(q)$.

Lemma 3. At any quantity $q \in (0, D(0)]$, there are $p_l, p_h \in [0, \infty)$ and
an $\alpha \in [0, 1]$ such that

$$R(q) = \alpha R_s(D_l) + (1 - \alpha) R_s(D_h). \tag{8}$$

Proof. If $q \le \bar D$, let $p_l = \bar p$, $p_h > \bar p$, and $\alpha = q/\bar D$. Then lemma 1
implies that $R(q) = \bar p q = \alpha \bar p \bar D = \alpha R_s(\bar D) = \alpha R_s(\bar D) + (1 - \alpha) R_s(0)$. If
$q > D(\bar p)$, then equation (5) of lemma 2 implies that, for some pair
of prices $p_l$, $p_h$, there is an $\alpha \in [0, 1]$ such that $R(q) = \alpha p_l D_l + (1 - \alpha) p_h D_h
= \alpha R_s(D_l) + (1 - \alpha) R_s(D_h)$. In either case we obtain equation (8). Q.E.D.

Proposition 3. The firm can maximize its revenue at all output 
levels using a single price if and only if the single-price revenue func¬ 
tion is concave. 

Proof. Let $H = \{(r, q) \in [0, \infty)^2 : r \le R(q)\}$ and let co $H$ denote the
convex hull of $H$. Similarly, let $H_s = \{(r, q) \in [0, \infty)^2 : r \le R_s(q)\}$ and let
co $H_s$ denote its convex hull. Since the firm always has the option of
using a single price, it follows immediately that $R(q) \ge R_s(q)$ for all
$q \ge 0$. Therefore, co $H_s \subset$ co $H$. Furthermore, the concavity of $R(\cdot)$
implies that co $H = H$ and hence that co $H_s \subset H$. But lemma 3 implies
that $H \subset$ co $H_s$. Therefore, $H =$ co $H_s$.

To complete the proof, note that $H_s =$ co $H_s$ and hence $R(q) = R_s(q)$
for all $q$ if and only if $R_s(\cdot)$ is a concave function. Q.E.D.

The profit-maximizing quantity depends on both the revenue and
the cost functions. If the single-price revenue function is not
concave, it is always possible to choose a cost function so that a single-price
policy is not profit maximizing. However, since lemma 2 implies
that marginal revenue is constant at such output levels, it follows that,
if marginal cost is constant or decreasing, a single price is always sufficient.

Proposition 4. If the cost function is concave, then the firm can
always maximize its profit by charging a single price.

Proof. Let $C(\cdot)$ be the cost function and suppose profit is maximized
at output level $\hat q$. Let $q^* = \sup\{q : R(q) - C(q) = R(\hat q) - C(\hat q)\}$. Then the concavity of
$R(\cdot)$ implies that $R(\cdot)$ is continuous and hence that profit is maximized
at $q^*$. Now suppose that the firm must charge two prices to
maximize profits. Then proposition 2 implies that there is an $m \in
(-\infty, \infty)$ and an $\epsilon > 0$ such that $R(q) = R(q^*) + m(q - q^*)$ for any
$q \in [q^* - \epsilon, q^* + \epsilon]$. Since $q^*$ maximizes profit, it follows that
$R(q^*) - C(q^*) \ge R(q^* - \epsilon) - C(q^* - \epsilon) = R(q^*) - m\epsilon - C(q^* - \epsilon)$ and
hence that $C(q^*) - C(q^* - \epsilon) \le m\epsilon$. But then the concavity of $C(\cdot)$
implies that $R(q^* + \epsilon) - C(q^* + \epsilon) = R(q^*) - C(q^*) + m\epsilon - [C(q^* + \epsilon) - C(q^*)]
\ge R(q^*) - C(q^*) + m\epsilon - [C(q^*) - C(q^* - \epsilon)] \ge R(q^*) - C(q^*)$.
Consequently, profits are also maximized at $q^* + \epsilon$, contradicting
the definition of $q^*$. Q.E.D.

Proposition 4 implies in particular that, if marginal cost is constant, 
a monopolist need never charge more than one price. 


VI. An Example with Two Prices 

For any demand function with points of discontinuity, there will be a 
range of output levels at which it is optimal to charge more than one 
price. We will consider here such a demand curve and contrast the 
optimal pricing policy and the corresponding revenue function with 
the pricing policy and revenue function that result when the firm is 
constrained to charge a single price. 

Consider the following demand function:

$D(p) = 0$ for $p > 2$,

$D(p) = 1$ for $1 < p \le 2$,

$D(p) = 5$ for $0 < p \le 1$.

If the firm is constrained to charge a single price, then it will earn a
revenue of $2q$ for $q \le 1$ and a revenue of $q$ for $1 < q \le 5$. The marginal
revenue is undefined at $q = 1$. Now suppose that we permit the firm
to charge more than one price. Then for $q \le 1$, its optimal pricing
strategy is to charge a price of 2 for all units. For $1 < q \le 5$, its optimal
policy is to charge price 1 for the first $5(q - 1)/4$ units and price 2 for
the remaining $(5 - q)/4$ units. The marginal revenue is 2 for the first
unit and 3/4 for the next four units.

The example is illustrated in figure 1. The demand curve is pre¬ 
sented in the top half of the figure. In the bottom half of the figure 
are the two revenue functions. Both the optimal revenue function 
and the single-price revenue function are identical for the first unit. 
At $q = 1$, the single-price revenue function $R_s$ has a downward discontinuity,
whereupon it increases with slope 1 up to $q = 5$. In contrast,
the optimal revenue function is found by connecting the single-price
revenue at $q = 1$ with the single-price revenue at $q = 5$ with a
straight line. Over this range of outputs, the firm will charge a price of 
1 for some units and a price of 2 for the other units. 
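
The two revenue functions of this example can be tabulated to confirm the geometry just described. The sketch below hard-codes the example's demand curve (price 2 up to one unit, price 1 up to five units); the function names and the grid-based midpoint test for concavity are illustrative choices made here, not part of the original argument.

```python
def single_price_revenue(q):
    """R_s(q): price 2 can be charged up to q = 1, price 1 up to q = 5."""
    return 2 * q if q <= 1 else q


def two_price_revenue(q):
    """R(q): equals R_s on [0, 1], then the chord from (1, 2) to (5, 5)."""
    return 2 * q if q <= 1 else 2 + 0.75 * (q - 1)


def is_concave_on_grid(f, grid, tol=1e-9):
    """Midpoint test: f(b) >= (f(a) + f(c)) / 2 for consecutive grid points."""
    return all(
        f(grid[i]) >= (f(grid[i - 1]) + f(grid[i + 1])) / 2 - tol
        for i in range(1, len(grid) - 1)
    )


grid = [i / 100 for i in range(0, 501)]          # outputs 0.00, 0.01, ..., 5.00
gap = [two_price_revenue(q) - single_price_revenue(q) for q in grid]

print(is_concave_on_grid(single_price_revenue, grid))   # fails just after q = 1
print(is_concave_on_grid(two_price_revenue, grid))
print(min(gap), two_price_revenue(5.0), single_price_revenue(5.0))
```

The two functions agree on the first unit and again at $q = 5$; in between, the two-price revenue lies strictly above the single-price revenue and rises at the constant rate 3/4.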

Fig. 1.—Optimal two-price revenue vs. the single-price revenue


VII. The Welfare Implications 

Since the single-price policies are always available, the profits of the 
monopolist must be higher whenever it strictly prefers a two-price 
policy. For the consumers, however, the welfare implications are 
more ambiguous. Suppose, for instance, that it is optimal for the firm 
to use a two-price policy, and consider the corresponding single-price 
policy that results in the same level of aggregate sales. Since some 
consumers are rationed at the lower price when the two-price policy is 
used, the net benefit to consumers must be higher under the single¬ 
price policy. Moreover, as demonstrated in proposition 5, the loss to 
consumers must be greater than the gain to the monopolist. 

Proposition 5. The profit-maximizing two-price policy results in 
higher social net benefit than the profit-maximizing single-price pol¬ 
icy only if it results in a higher level of sales. 

Proof. Let $((p_l, q_l), (p_h, q_h))$ be the profit-maximizing two-price policy
and let

$$B_2 = \int_{p_h}^{\infty} D(x)\,dx + \left(\frac{q_l}{D_l}\right) \int_{p_l}^{p_h} D(x)\,dx + R(q_l + q_h) - C(q_l + q_h)$$

be the net social benefit from this policy. Similarly, let $(p_s, q_s)$ be the
profit-maximizing single-price policy and let
$B_s = \int_{p_s}^{\infty} D(x)\,dx + R_s(q_s) - C(q_s)$
be the net social benefit from the single-price policy. Then,
using equation (6) and the fact that $R(q_l + q_h) = p_l q_l + p_h q_h$, we obtain

$$\begin{aligned}
B_2 - B_s &= -\int_{p_s}^{p_h} D(x)\,dx + \left(\frac{q_l}{D_l}\right) \int_{p_l}^{p_h} D(x)\,dx \\
&\quad + [R(q_l + q_h) - C(q_l + q_h)] - [R_s(q_s) - C(q_s)] \\
&\le -q_h(p_h - p_s) + q_l(p_s - p_l) + [R(q_l + q_h) - C(q_l + q_h)] - [R_s(q_s) - C(q_s)] \\
&= [p_s(q_l + q_h) - C(q_l + q_h)] - [R_s(q_s) - C(q_s)].
\end{aligned}$$

Then, since $q_l + q_h \le q_s$ implies $R_s(q_l + q_h) \ge p_s(q_l + q_h)$ and since $q_s$ is
the profit-maximizing single-price output level, it follows that

$$B_2 - B_s \le [R_s(q_l + q_h) - C(q_l + q_h)] - [R_s(q_s) - C(q_s)] \le 0.$$

Q.E.D.

In general, the level of output under the optimal two-price policy 
may be either larger or smaller than the level of output under the 
optimal single-price policy. 

Consider, for example, the demand curve of Section VI with cost
function $C(q) = aq^2$. In this case, the allocation of goods among consumers
is efficient even in a two-price system since all the demand at
price 2 is satisfied if any units are sold at price 1. 5 Consequently, net
social benefit will be higher under the pricing regime that results in
the highest level of output. For $3/8 < a < 1$, both the single-price and
the two-price monopolist sell one unit at price 2. For $(2 - 3^{1/2})/2 < a < 3/8$,
the single-price monopolist continues to sell one unit while the
two-price monopolist increases its sales to $3/(8a)$. For $a < (2 - 3^{1/2})/2$,
the two-price monopolist continues to sell $3/(8a)$ units while the single-price
monopolist increases its sales to $1/(2a)$ units at price 1. Thus,
depending on the value of $a$, either pricing regime may be the more
efficient.

One word of caution. This analysis supposes that the presence of 
rationing does not affect the consumers’ decisions on when they visit 
the store. However, if consumers know that they will be served on a 
first-come-first-served basis, there is an obvious incentive to visit
earlier than they would have otherwise. As long as there is no relation
between the shape of their demand curves and the timing of their
visits, the decision problem of the firm will be unaffected. However,
to the extent that the decision to move earlier is costlier to consumers,
it imposes an additional cost that should also be included in the
welfare analysis.

5 When the firm charges a single price of 1, it is appropriate to suppose that the high
reservation demand is completely satisfied since it would be if the demand curve were
strictly decreasing throughout.

VIII. Some Applications 

Does this model ever explain the pricing policy of a firm? It is not 
enough simply to find examples in which the firm charges different 
prices for the same good, nor even where it rations the supply at 
random. Both the demand curve and the cost function must satisfy 
certain properties. First, the marginal revenue function of the firm 
must be increasing over some range of output, which means that the 
demand curve must look something like a step function. This condi¬ 
tion is most likely to be satisfied if we view the population as contain¬ 
ing two or more relatively distinct types so that the distribution of 
reservation values has more than one peak. There must also be some 
practical limits on capacity so that the marginal cost is increasing at 
the profit-maximizing output level. 

One plausible candidate for an application of this model is the 
pricing policy of airlines. Through various restrictions on the condi¬ 
tions of the sale, airlines are effectively able to segment their demand 
into groups with very different elasticities of demand. Round-trip 
tickets in which the customer leaves and returns on the same weekday 
are frequently more expensive than round-trip tickets with a layover 
of a week or more. The same-day tickets are typically purchased by 
businessmen with a relatively low elasticity of demand, while the other 
tickets are more likely to be purchased by vacationers with consider¬ 
ably higher elasticities. Even within the latter group, however, there 
may still be some heterogeneity that can be exploited only by ration¬ 
ing a limited supply of low-price super-saver fares. 

Except that the consumer must purchase the ticket a week or two in 
advance and cannot change his travel plans, the lower fares offer 
exactly the same service as the regular fare. Although these restric¬ 
tions may explain why super-saver fares must be sold at a discount, 
they do not explain why the supply of super-saver fares is rationed. If 
the airline is not trying to exploit the heterogeneity of its consumers, a 
more sensible policy would be to raise the price of super-saver fares 
until the supply equals demand. What I suspect is happening is that
the airline realizes that many purchasers of super-saver fares are
willing to purchase tickets at a higher price, but it has no effective way
of identifying this population. By rationing the supply of super-saver
fares and selling the remaining tickets at a much higher price, it
makes more money than it would if it raised the price of every ticket.

Another application of the model is suggested by the work of Katz
and Nelson (1986) on the Federal Trade Commission’s (FTC) “super¬ 
market unavailability rule." This regulation prohibits a firm from 
advertising a good unless its supply of the good is sufficient to meet a 
“reasonable" demand at the advertised price. Presumably, the inten¬ 
tion is to eliminate “bait and switch” tactics in which the firm lures 
“naive” consumers into the store at the low price and then forces them 
to purchase the product at a higher price. This model suggests that
such a policy might be profit maximizing even with fully informed 
rational consumers if the firm is trying to exploit differences in the 
reservation values of its customers. In this case, the welfare analysis 
above suggests that, although there is some bias toward imposing a 
single price rule, there may be instances in which the net social gain 
and possibly even the welfare of the average consumer are reduced by 
eliminating the pricing practices proscribed by the FTC regulations.

The paper develops a model that shows the effects of rational
expectations, and of efficient markets, on public utility regulation. It is
shown that the feedback from investor expectations to regulatory
behavior, together with investor expectations that take account of
this feedback, basically alters the consequences of regulatory
decisions. The analysis examines the effects of a deviation between the
allowed rate of return and the cost of capital, with both perfect and
imperfect investor foresight. It also assesses the consequences of
differing expected growth rates. Conclusions are drawn for the
effects of regulatory decisions on resource misallocation and of
regulatory lag on incentives.


I. Introduction

The objective of this paper is to examine the consequences of rational
expectations, and of efficient markets, on public utility regulation. We
show that investor expectations about the allowed rate of return have
the effect of altering, fundamentally, the nature and economic
implications of regulatory decisions. More specifically, we outline a model
of regulatory decisions that explicitly encompasses two interrelated
elements: first, the effect of feedback from investor expectations to
regulatory behavior and, second, the role of investor expectations
that take account not merely of the initial intentions of regulatory
agencies but also of their response to feedback.

It is helpful to place our analysis, though only briefly, in the context
of the highlights of past discussion of rate-of-return regulation. We
start with the early textbook versions of current theory as presented 
by Bonbright (1961) and Phillips (1965). The approach assumes a 
regulatory constraint imposed by commissions based on an allowed 
rate of return to be earned on the “rate base” of an accounting book 
value of assets. The allowed rate of return is assumed either to equal 
the cost of capital or to be slightly above it to allow a margin for error. 

On this foundation, Averch and Johnson (1962) developed their 
hypothesis of resource misallocation (overcapitalization) given a sys¬ 
tematic (positive) deviation of the allowed rate of return from the cost 
of capital. Baumol and Klevorick (1970) carried the analysis further, 
explicitly modeling the effects of regulatory lag with cost-based regu¬ 
lation. They showed the existence of incentives for cost-saving inno¬ 
vations in the context of regulation that responds to changes in costs 
only with a lag. And Bailey and Coleman (1971) showed that regula¬ 
tory lag may reduce incentives for overcapitalization. 

The Averch-Johnson hypothesis led to numerous attempts at an 
empirical test, though, thus far, with generally inconclusive and con¬ 
tradictory results. There is, however, a growing body of empirical 
evidence, for example, Hagerman and Ratchford (1978) and Navarro 
(1983), that points to the response of commissions in their regulatory 
decisions to aspects of investor behavior. These results are consistent 
with our assumption of regulatory feedback, which in turn leads to 
the conclusion that the implications of earlier models require modifi¬ 
cation. 

The essential structure of our argument starts from the proposition 
that regulatory commissions must take account of change in the cost 
of capital in setting public utility prices, even if they choose a rate of 
return that deviates from the measured cost of capital. Market expec¬ 
tations, including expectations about regulatory policy, enter into de¬ 
termining the measured cost of capital (through their effect on secu¬ 
rity prices), and this in turn eventually affects the allowed rate of 
return. The implied circularity is reminiscent of the old discussion of 
the “fair value” rate base, though that concerned the rate base rather 
than the cost of capital. 

The paper is organized as follows: Section II discusses the feedback 
mechanism that results from the relation between security prices and 
the measured cost of capital. Section III develops a model in which 
security values based on rational expectations are indirectly incorpo¬ 
rated into regulatory decisions, while Section IV examines the conse¬ 
quences of relaxing the assumption of perfect investor foresight. 
Finally, Section V provides some concluding remarks on the
implications of our results for resource misallocation and for the effects of
regulatory lag on incentives.

II. The Feedback Mechanism

The two methods most commonly used by regulatory commissions 
for estimating the cost of equity for rate-of-return regulation directly
depend on share prices. They are the discounted cash flow method 
(DCF) and a method based on earnings/price ratios (E/P). The for¬ 
mer is measured as the current dividend yield (the ratio of dividends 
to current share price) plus the anticipated growth in dividends per 
share. A recent survey by Dukes and Chandy (1983) found DCF to be 
the most frequently used technique. The E/P ratios, a more restrictive 
version of DCF, use the ratio of earnings per share to current share 
price. 1 

Any change in investors' expectations is reflected in the relation of 
current valuations to past earnings and dividend history, with the 
result that the estimate of the cost of equity is inversely related to the 
direction of change in share prices. 
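
To make the inverse relation concrete, the following Python sketch computes both estimates at several share prices. The function names and the dividend, earnings, and growth figures are illustrative assumptions, not values drawn from any rate case; the formulas are simply the DCF and E/P definitions stated above.

```python
def dcf_cost_of_equity(dividend_per_share, share_price, dividend_growth):
    """DCF estimate: current dividend yield plus anticipated dividend growth."""
    return dividend_per_share / share_price + dividend_growth


def ep_cost_of_equity(earnings_per_share, share_price):
    """E/P estimate: earnings per share divided by current share price."""
    return earnings_per_share / share_price


if __name__ == "__main__":
    # Hypothetical figures, chosen only for illustration.
    dividend, earnings, growth = 2.00, 2.50, 0.04
    for price in (20.0, 25.0, 30.0):
        dcf = dcf_cost_of_equity(dividend, price, growth)
        ep = ep_cost_of_equity(earnings, price)
        print(f"price={price:5.1f}  DCF={dcf:.4f}  E/P={ep:.4f}")
```

Holding the dividend and earnings history fixed, a higher share price lowers both estimates, which is the channel through which investor expectations reach the allowed rate of return.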

Commissions sometimes also take evidence on the “comparable 
earnings criterion,” which examines the ratios of earnings to book 
value for a group of firms with risk characteristics believed to be 
similar to those of the company in the rate case. 2 Still another tech¬ 
nique with which evidence is sometimes presented in regulatory pro¬ 
ceedings is the capital asset pricing model (CAPM). The allowed 
return is set as the risk-free rate plus a risk premium based on system¬ 
atic risk. 

A commission could, in principle, ignore the level of security prices 
by invoking the CAPM method or by arbitrarily selecting what it 
deems an appropriate rate of return. Alternatively, it could tem¬ 
porarily ignore current share prices on the assumption that the cur¬ 
rent rate at which earnings are capitalized does not reflect the “long- 
run” cost of capital. But any deviation between the allowed rate of 
return and the cost of capital measured with current share prices has 
a crucial consequence. Share prices will be a negative function of the 
growth in the stock of physical assets if the allowed rate of return is 
less—and a positive function if it is more—than the cost of capital so 
measured. It is inevitable, therefore, that at least with growing
demand and new capital requirements, the effect of regulatory policy on
share prices must ultimately feed back into regulatory decisions to
maintain requisite financing for expansion. 3

1 While public utility commissions usually do not identify explicitly a single estimating
procedure, one can infer from the decision what types of evidence were given primary
weight. For example, E/P ratios appear to have been decisive in the Duquesne Light
case in 1972 (97 PUR 3d 227) and Detroit Edison in 1970 (83 PUR 3d 463), while the
DCF method appears to have been the primary basis for Cincinnati Gas and Electric in
1974 (7 PUR 4th 138) and Consumers Power in 1978 (25 PUR 4th 167).

2 Examples of the use of this criterion are the Florida Power case in 1972 (98 PUR 3d
113) and Houston Lighting and Power in 1980 (36 PUR 4th 94).

For the feedback mechanism to operate in the manner we hy¬ 
pothesize, capital markets must be efficient in at least the semistrong 
sense; namely, all information known to investors must be immedi¬ 
ately reflected in security prices. We first develop the model on the 
assumption of rational expectations. That is, we assume that investors 
know the regulatory reaction function of the allowed rate of return to 
share prices. The stringent condition of perfect investor foresight 
about regulatory decisions is then relaxed, and the analysis proceeds 
on the basis of capital market efficiency in the sense above; that is, 
investor expectations, whatever they may be, are immediately 
reflected in market valuations. 

III. The Rational Expectations Model 

We start with an equilibrium model of the market value of equity and
allowed rate of return for a public utility, where the two variables are
determined simultaneously as

$$V = \int_0^{\infty} (r + \delta) B e^{-\rho t}\,dt = \frac{(r + \delta)B}{\rho}, \tag{1}$$

where $V$, $B$, and $\rho$ are the market value, book value, and cost of equity
capital, respectively; $r = r(V)$ is the estimated cost of equity and the
source of the feedback mechanism. A change in the market value of
equity has an inverse effect on the allowed rate of return ($dr/dV < 0$)
as with the DCF or E/P techniques for estimation of the cost of equity.
The estimate $r$ depends on $V$ and, hence, implicitly on $\rho$. Since $\rho$
cannot be observed directly, the regulator’s information about $\rho$ is
limited to its proxy, $r$. Set by the commission, $r + \delta$ is the realized rate
of return, and $\delta$ may be positive or negative. The sign of the
deviation, $\delta$, is affected by a number of variables, among which are (1) the
safety margin over the estimated cost of capital allowed by
commissions, (2) economic profit arising from regulatory lag and from the
consequent delay in reducing utility rates following cost-saving
technical change, (3) deficiencies in returns as the result of inflation and a
regulatory lag-induced delay in adjusting utility rates to changes in
costs, and (4) arbitrary rate-making policies of commissions.

For simplicity, we initially assume (but later relax the assumptions)
that there is only one rate hearing rather than a succession of
hearings and that there is no growth in the firm’s stock of physical capital.
Implicit also in our analysis is the assumption that the cost of capital is
independent of leverage. The latter is based on the assumed
applicability of the Modigliani and Miller (1958) theorem when the tax
advantages of debt financing are offset, as with public utility regulation,
by the pass-through of taxes into consumer prices.

3 It is not surprising, therefore, that regulatory concern about short-run share price
movements is clearly reflected in rate hearings. Examples of this may be found in
hearings for Consolidated Edison in 1964 (54 PUR 3d 43) and again in 1970 (85 PUR
3d 276) and for Boston Edison in 1974 (6 PUR 4th 77).

With rational expectations, investors perceive the effect of $V$ on $r$.
Because of the simultaneous relation between these two variables, the
following chain rule produces the ultimate effect of a change in $\delta$:

$$\frac{dV}{d\delta} = \frac{\partial V}{\partial \delta} + \frac{\partial V}{\partial r}\frac{dr}{dV}\frac{dV}{d\delta} = \frac{B}{\rho} + \frac{B}{\rho}\frac{dr}{dV}\frac{dV}{d\delta}. \tag{2}$$

Simplifying, we get

$$\frac{dV}{d\delta} = \frac{B}{\rho - B(dr/dV)} < \frac{B}{\rho}. \tag{3}$$

In the absence of the feedback mechanism, $B/\rho$ is the impact on the
market value of equity of a change in $\delta$. The feedback of $V$ to $r$,
however, attenuates any positive or negative deviation from a cost of
capital return.

A description of the movement to an equilibrium may be helpful.
Suppose, for example, that $\delta$ is expected by investors to be negative in
an inflationary period. As $V$ declines in response to $\delta$, $r$ rises and
dampens the decline in $V$. The attenuation of the deviation is only
partial, however, since a net decline in $V$ must first be observed in
order to feed back positively to $r$. The more elastic the feedback
function, the greater the attenuation.
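
The attenuation can be illustrated with a small Python sketch. It assumes a hypothetical linear feedback rule, r(V) = rho - k(V - B)/B, which is not the paper's specification; it is chosen only because the fixed point of V = (r(V) + delta)B/rho then has a simple closed form. Here `rho` stands for the cost of equity capital, `delta` for the deviation, and `k` is an assumed feedback strength.

```python
def value_response(B, rho, dr_dV):
    """Equation (3): dV/d(delta) = B / (rho - B * dr/dV), with dr/dV <= 0."""
    if dr_dV > 0:
        raise ValueError("the feedback slope dr/dV is assumed non-positive")
    return B / (rho - B * dr_dV)


def equilibrium_value(B, rho, delta, k):
    """Fixed point of V = (r(V) + delta) * B / rho under the assumed
    linear feedback r(V) = rho - k * (V - B) / B."""
    return B * (1.0 + delta / (rho + k))


if __name__ == "__main__":
    B, rho, delta = 100.0, 0.10, -0.02  # hypothetical numbers
    for k in (0.0, 0.05, 0.10, 0.50):   # k = 0 means no feedback
        slope = value_response(B, rho, -k / B)
        V = equilibrium_value(B, rho, delta, k)
        print(f"k={k:4.2f}  dV/d(delta)={slope:7.1f}  V={V:6.2f}")
```

With k = 0 the response is the full B/rho; as k grows, the same negative deviation produces a smaller decline in V, in line with the statement that a more elastic feedback function yields greater attenuation.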

An interesting result is derived when the E/P ratio is used as the
estimator of the cost of equity capital. 4 Assume, for simplicity, that
$r = \rho$ for all periods prior to the current rate hearing. It follows that
$nE = \rho B$, where $n$ is the number of shares, $E$ the annual earnings per
share, and $\rho B$ the zero economic profit level of earnings. Since
$nP = V$ by definition,

$$r = \frac{E}{P} = \frac{\rho B}{V}. \tag{4}$$

Combining with equation (3) and rearranging, we get

$$\left.\frac{dV}{d(\delta B)}\right|_{\delta = 0} = \frac{1}{2\rho}, \tag{5}$$

since $V = B$ when the derivative is evaluated at $\delta = 0$. Given that $1/\rho$ is
the market value of a marginal change in annual earnings, equation
(5) yields the result that one-half of the anticipated deviation of the
market value of equity from its level with a cost of capital rate of
return is attenuated by the feedback mechanism. Given the same
assumptions, an identical result is derived for any difference between
the realized rate of return and the cost of capital ($r + \delta - \rho$),

$$d(r + \delta - \rho) = \frac{dr}{dV}\,dV + d\delta = \tfrac{1}{2}\,d\delta. \tag{6}$$

4 The no-growth assumption implies that the DCF and E/P estimates are equivalent
since all earnings are paid as dividends. For simplicity, the E/P ratio is used throughout
for illustrations. The same substantive conclusions result from the DCF model, even
with growth, because both estimation techniques are inversely related to the value of
equity and are associated with roughly the same elasticities.

In practice, the degree of attenuation will depend to some extent 
on which method of measuring the cost of capital regulators use. 
Whichever approach is used, however, the effect is important and 
greatly limits the discretionary authority of regulatory commissions to 
set rates of return that deviate from the cost of capital. Moreover, the 
alleged incentive effects of regulatory lag in bringing about cost¬ 
saving technical change and the adverse effect of regulatory lag on 
investors when inflation is present are both greatly reduced given 
investor foresight. 
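
The one-half result can be checked numerically. The sketch below assumes the E/P rule r = rho*B/V is applied exactly, so that the equilibrium condition V = (r + delta)B/rho becomes the quadratic V^2 - (delta*B/rho)V - B^2 = 0; the closed-form root and the finite-difference check are illustrative constructions rather than part of the original derivation, and the numbers are hypothetical.

```python
import math


def ep_equilibrium_value(B, rho, delta):
    """Positive root of V**2 - (delta*B/rho)*V - B**2 = 0, i.e. the fixed
    point of V = (rho*B/V + delta) * B / rho under a strict E/P rule."""
    a = delta * B / rho
    return 0.5 * (a + math.sqrt(a * a + 4.0 * B * B))


def realized_deviation(B, rho, delta):
    """r + delta - rho, with r = rho*B/V evaluated at the equilibrium V."""
    V = ep_equilibrium_value(B, rho, delta)
    return rho * B / V + delta - rho


def central_difference(f, h=1e-6):
    """Numerical derivative of f at zero."""
    return (f(h) - f(-h)) / (2.0 * h)


if __name__ == "__main__":
    B, rho = 100.0, 0.10  # hypothetical book value and cost of equity
    dV = central_difference(lambda d: ep_equilibrium_value(B, rho, d))
    dDev = central_difference(lambda d: realized_deviation(B, rho, d))
    print("dV/d(delta)       :", dV, " vs B/(2*rho) =", B / (2.0 * rho))
    print("d(r+delta-rho)/dd :", dDev, " vs 1/2")
```

Under these assumptions the value response is half of the no-feedback response B/rho, and the realized deviation moves by half of the change in delta.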

Thus far we have used the simplifying assumption of a single rate
hearing. For consistency with reality, the model must now be
generalized for the feedback effect of multiple successive rate hearings.
Again for simplicity, assume two hearings, one current and the other
at $T_1$. If we define

$$\alpha = 1 - e^{-\rho T_1}, \qquad 0 < \alpha < 1, \tag{7}$$

then the (current) market value of equity is

$$V_0 = \alpha\,\frac{(r_0 + \delta)B}{\rho} + (1 - \alpha)V_1, \tag{8}$$

where $r_0 = r_0(V_0)$ is the cost of capital as estimated in the current rate
hearing; $V_1 = [(r_1 + \delta)B]/\rho$ is the market value of equity at $T_1$, where
$r_1 = r_1(V_1)$ is the cost of capital estimate for rate-making purposes at
$T_1$. The coefficient $(1 - \alpha)$ converts $V_1$ to its present value.

Given the feedback of $V_0$ to $r_0$ and of $V_1$ to $r_1$ and, further,
inasmuch as $V_1$ is independent of $V_0$ since it depends only on events
beyond $T_1$, the appropriate chain rule to derive the impact of a
change in $\delta$ is

$$\frac{dV_0}{d\delta} = \frac{\partial V_0}{\partial \delta} + \frac{\partial V_0}{\partial r_0}\frac{dr_0}{dV_0}\frac{dV_0}{d\delta} + \frac{\partial V_0}{\partial V_1}\frac{dV_1}{d\delta}. \tag{9}$$

Combining equations (8) and (9) and substituting equation (3) for
$dV_1/d\delta$, we get

$$\frac{dV_0}{d\delta} = \frac{\alpha B}{\rho - \alpha B(dr_0/dV_0)} + (1 - \alpha)\,\frac{\rho B}{[\rho - \alpha B(dr_0/dV_0)][\rho - B(dr_1/dV_1)]}. \tag{10}$$


The impact of feedback on the allowed rate of return may be
greater or less with two successive rate hearings than with one,
depending on the parameters in equation (10). However, with an E/P
ratio as the basis for estimating the cost of capital, an unequivocal
result follows. Given the initial condition of $\delta = 0$ and $V = B$, the
feedback mechanism may be written as the constant $dr/dV = -\rho/B$
from equation (4). When we assume the same response function for $r_0$
and $r_1$ and substitute into equation (10), the solution simplifies to the
same result of $1/(2\rho)$ in equation (5). In addition, with the same
assumptions, this result is easily extended to any number of rate hearings.
With full investor foresight and zero growth in assets, an E/P rule for
measuring the cost of capital results in attenuation, by the feedback
mechanism, of one-half the market value of anticipated deviations
from a cost of capital return, and this conclusion is independent of the
number of subsequent rate hearings.
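
The independence from the timing of the second hearing can be verified with a short Python sketch. The function below writes out the two-hearing response obtained by combining equations (8) and (9) with equation (3), namely alpha*B/d0 + (1 - alpha)*rho*B/(d0*d1) with d0 = rho - alpha*B*(dr0/dV0) and d1 = rho - B*(dr1/dV1); the function name and the numerical values are assumptions made for illustration.

```python
import math


def two_hearing_response(B, rho, alpha, dr0_dV0, dr1_dV1):
    """dV0/d(delta) with one current hearing and one later hearing,
    where alpha = 1 - exp(-rho * T1) as in equation (7)."""
    d0 = rho - alpha * B * dr0_dV0
    d1 = rho - B * dr1_dV1
    return alpha * B / d0 + (1.0 - alpha) * rho * B / (d0 * d1)


if __name__ == "__main__":
    B, rho = 100.0, 0.10      # hypothetical book value and cost of equity
    ep_slope = -rho / B       # E/P feedback evaluated at delta = 0, V = B
    for T1 in (1.0, 5.0, 20.0):
        alpha = 1.0 - math.exp(-rho * T1)
        resp = two_hearing_response(B, rho, alpha, ep_slope, ep_slope)
        print(f"T1={T1:5.1f}  alpha={alpha:.3f}  dV0/d(delta)={resp:.6f}")
    print("single-hearing benchmark B/(2*rho) =", B / (2.0 * rho))
```

For every value of T1 the response equals B/(2*rho), so dividing by B reproduces the 1/(2*rho) of equation (5).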

Intuitively, this result is based on the functional form of an E/P
ratio as the reciprocal of the market value of equity. Two opposing
forces, the market value of equity ($V$) and the allowed rate of return
($r$), are operative. If the commission strictly follows an E/P ratio for
changing $r$ (even when $\delta \neq 0$), it follows that the elasticity of regulatory
response (the feedback) is the same as the elasticity of investors’
valuations. Accordingly, $V$ and $r$ move in opposite directions until a
midpoint solution is achieved.

Thus far, we have assumed zero growth in assets. Allowing for 
growth augments the attenuation effect of the feedback mechanism. 
In an efficient capital market, current shareholders capture the
anticipated net present value (NPV) of future investment. For a given $\delta$, the
deviation from B in the market value of equity will be greater if there 
is growth in the asset base, resulting in additional feedback to the 
allowed rate of return. Incorporating growth into the single rate 
hearing model demonstrates this outcome. 

With exponential growth at rate $g$ in the book value of equity,
$B(t) = Be^{gt}$ (not to be confused with the dividend growth rate in the
standard DCF model), the discounted present value of earnings for
both current and future shareholders is

$$V_e = \int_0^{\infty} e^{-\rho t}(r + \delta)Be^{gt}\,dt = \frac{(r + \delta)B}{\rho - g}. \tag{11}$$
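
As a rough check on equation (11), the following Python sketch compares the closed form (r + delta)B/(rho - g) with a direct numerical integration of the discounted earnings stream. The finite horizon, the midpoint rule, and all parameter values are assumptions made for illustration; the closed form requires g < rho for the integral to converge.

```python
import math


def value_with_growth(B, rho, g, r, delta):
    """Closed form of equation (11); valid only when g < rho."""
    if g >= rho:
        raise ValueError("the present value diverges unless g < rho")
    return (r + delta) * B / (rho - g)


def value_by_integration(B, rho, g, r, delta, horizon=400.0, steps=40000):
    """Midpoint-rule approximation of the integral in equation (11),
    truncated at a long but finite horizon."""
    dt = horizon / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += math.exp(-rho * t) * (r + delta) * B * math.exp(g * t) * dt
    return total


if __name__ == "__main__":
    B, rho, g, r, delta = 100.0, 0.10, 0.04, 0.10, 0.01  # hypothetical
    print("closed form :", value_with_growth(B, rho, g, r, delta))
    print("integration :", value_by_integration(B, rho, g, r, delta))
```

Setting g = 0 recovers the no-growth value in equation (1), and for a given positive delta the excess of value over B is larger with growth than without it.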

With perfect foresight, share price falls to zero the instant the
condition in equation (17) is recognized by investors. More realistically,
investors might, at first, expect regulatory feedback then gradually
revise expectations of the elasticity of regulatory response downward
as the regulator continues to refuse a sufficient increase in utility
rates. It can also be shown that share prices approach zero
asymptotically even when $r + \delta > g$, given growth and an allowed return below
the cost of capital. Intuitively, this results from the dilution of
earnings per share over time as a larger number of shares per period is
required to finance negative NPV.

The regulator is equally constrained in allowing a substantial mar¬ 
gin above the cost of capital. Current shareholders would capture the 
anticipated NPV of growth as a result of the margin. This implies that 
the regulator, by ignoring the feedback mechanism, permits the time 
path of share price to rise as a direct function of $\delta$ and $g$.

Moreover, an attempt to maintain $\delta$ stimulates further growth
(raising $g$) because the marginal efficiency of investment, from the firm’s
standpoint, remains level. If the social marginal efficiency of
investment is declining, rising costs associated with above-optimal growth
are implicitly passed through to consumers, requiring repeated
commission-mandated rate increases to maintain $\delta$. This process can
continue until the elasticity of demand (even with monopoly) is high
enough to reduce the increment to total revenue from a price in¬ 
crease (induced by a rise in the rate base) to an amount less than a cost 
of capital return on the added investment. 

From the standpoint of political constraints, the succession of rate 
increases is unlikely to go this far. The regulatory commission will 
have to either reduce the allowed rate of return or else restrict invest¬ 
ment by limiting what is allowed in the rate base. The Averch- 
Johnson model, in ignoring both the feedback mechanism and the 
effect of growth (with a positive $\delta$) on share prices, therefore specifies
a nonequilibrium relationship. 


V. Summary and Conclusions

The impact of regulation on resource allocation, incentives, and in¬ 
vestor expectations as reflected in security prices has been extensively 
discussed in economic literature. Causality in the opposite direction— 
that is, from investor expectations to regulatory decisions—has re¬ 
ceived little attention. We have tried to show that the allowed rate of 
return depends partly, in an inverse relation, on security prices and, 
hence, on investor expectations. Without incurring the drastic
consequences developed in the paper, a regulator cannot for long ignore
current security prices in rate-of-return decisions. More specifically,
the principal conclusions from our model are the following. (1) If a
deviation ($\delta$) between the realized rate of return and the measured
cost of capital is anticipated by investors, it will lead to a movement
in security prices that is a positive function of the deviation. This will
feed back, inversely, to the allowed rate of return, thus attenuating
the initial deviation. (2) The degree of attenuation of $\delta$ depends not
only on the elasticity of regulatory response to changes in security
prices but also on the extent to which the regulatory response is
foreseen by investors. If the feedback is underestimated, the effect of
$\delta$ on share prices will be greater, and, in consequence, so will the
extent of attenuation. (3) Assuming exogenous growth in demand and a
requirement that utilities invest to satisfy demand, share prices will be
a positive function of rate of growth when $\delta$ is positive and a negative
function when it is negative. As a result, growth augments the
attenuation of $\delta$.

These conclusions, in turn, have implications (a) for the alleged 
existence of an Averch-Johnson effect and (b) for the consequences 
both of regulatory lag and of intended deviations between the
measured cost of capital and the allowed rate of return. These are as
follows. (1) A positive $\delta$ produces a rise in endogenous investment, as
specified in the Averch-Johnson model, but this does not end the
process. For a rise in investment under these circumstances raises
share prices and hence $\delta$. This, in turn, feeds back to more investment
at an inefficient level and, hence, to commission-mandated rate
increases. Accordingly, a fixed, positive $\delta$ is a practicable regulatory
policy only with external constraints on investment (e.g., through limits
on inclusion in the rate base). (2) To the extent that regulatory lag is
foreseen by investors, it will be reflected in share prices and, hence,
will feed back to a lower allowed rate of return with cost-saving
innovations and to a higher one with inflation. As a result, the favorable
effect of a lag on incentives to innovate will be offset at least partly
and quite possibly in its entirety. Conversely, the adverse effect on
public utility shareholders of a regulatory lag with inflation will be far
less, and possibly zero, in the presence of investor foresight and
regulatory feedbacks.

In sum, investor foresight reduces the power of regulators for good
as well as for harm.

The causes and consequences of state maximum hours legislation
for female workers, passed from 1848 to the 1920s, are found to
differ from a recent interpretation. Although maximum hours
legislation served to reduce scheduled hours in 1920, the impact was
minimal. Curiously, the legislation appears to have operated equally
for men. Legislation affecting only women was symptomatic of a
general desire by labor for lower hours, and these lower hours were
achieved in the tight, and otherwise special, World War I labor
market. Most important, the restrictiveness of the legislation had no
adverse effect on the employment share of women in manufacturing.


The development of . . . state legislation on hours has
tended to follow a very definite pattern. An insistent 
and persistent demand for general legislation to insure 
shorter hours for all led to the passage of general eight- 
hour laws. When statutes of such unrestricted applica¬ 
tion proved unavailing, attempts at hours' regulation 
concentrated on specific classes of employees. [Cahill 
(1932) 1968, p. 94] 

State laws mandating daily and weekly maximum hours of work ap¬ 
peared as early as 1848, and by 1921 all but four states had passed 
such legislation. Half the states adopted their first laws during the
initial two decades of this century, and 40 states passed some form of
hours legislation during the second decade. While the precise number
of hours and the details varied by state, one aspect was common to all:
the laws applied almost exclusively to the employment of women. 1
General hours legislation had been declared unconstitutional by
several states and eventually by the Supreme Court in Lochner v. New York
(1905), but in the now-famous case Muller v. Oregon (1908) the Court
finally established the constitutionality of maximum hours legislation 
for women. At the time of their passage, maximum hours laws re¬ 
ceived mixed reviews. To their champions the laws would serve to 
protect women from their employers; however, others predicted that 
they would result in reduced female employment. 

The motivation for and impact of protective legislation have re¬ 
ceived renewed attention. In her study of maximum hours legislation, 
Elisabeth Landes (1980) concluded that it greatly reduced hours 
worked per week by women in 1920 and lessened their employment 
share in manufacturing, the major covered sector. Furthermore, the 
reduction in employment was most pronounced among the daughters 
of the foreign-born, and the passage of this legislation was, by infer¬ 
ence, supported by native-born, male manufacturing workers, who 
stood to gain the most. 

This paper reassesses the interpretation of maximum hours legisla¬ 
tion and introduces new information on its impact. The findings
differ substantially from those of Landes. 2 Although maximum hours
legislation reduced scheduled hours, the impact was minimal and it 
operated equally for men. The reasons for this apparently curious 
result are clear from the epigraph. Legislation affecting only women 
was often symptomatic of a general desire by labor for lower hours, 
and these lower hours were achieved in the tight, and otherwise spe¬ 
cial, World War I labor market. Legislation affecting only women was 
but one way labor sought to build a coalition in support of reduced 
hours. 3 It is important at the outset to point out that states that passed 
legislation did not always have lower scheduled hours; there is no 

1 Mississippi (in 1910) and Oregon (in 1920) passed legislation covering men, and the 
Georgia law covered all textile workers (see Cahill 1968). Many other states attempted 
to pass general legislation but were thwarted by various stale supreme courts, except 
when the laws explicitly allowed contracts for more than the maximum number of 
hours, rendering them virtually useless. 

2 It should be noted that Jones's (1975) analysis of the impact of maximum hours 
legislation on hours of work concluded that it played no role in the decline in hours 
from 1909 to 1919. 

3 I have not yet explored whether labor was constrained by existing hours in the pre-
World War I period. In her study of iron and steel, Shiells (1986) concluded that these
workers were constrained. Reduced immigration in the early 1920s lowered the median 
worker's supply of hours, but labor in the industry was insufficiently organized to effect 
change. The large reductions in hours of work in general over the 1909-19 period 
suggest that workers were constrained. 
relationship between scheduled hours of work by state in 1909 and
subsequent legislation. 4 

Most important, the restrictiveness of the legislation had no effect
on the employment share of women in manufacturing. Its restrictiveness
was, on the contrary, associated with a positive impact on the
employment share of women in sales (another covered sector). Finally,
higher labor force participation of women across cities during
the 1920s was strongly correlated with shorter hours of work per day,
consistent with one time-series explanation for the increase in female
market work and with recent cross-section evidence.

Some of the differences between my findings and those of Landes
are rooted in different specifications, while some are founded in new
evidence on hours of work. None of the differences between this
paper and the Landes article detracts from the contribution of the
original, which was to highlight an important and often forgotten
part of the history of hours of work, female employment, and protective
legislation.

The resolution of the impact of hours legislation is particularly
relevant with the recent passage of comparable worth legislation and
with renewed interest in the political economy of “rent-seeking” behavior.
Various types of protective legislation, such as child labor
laws, compulsory attendance, and pay equity, which were once viewed
as humanitarian in origin, have also been reinterpreted as directly
benefiting certain groups, but not the ones to which the legislation
directly applied. 5

Maximum hours legislation may be relevant, as well, to understanding
the long-term decline in hours of work over the course of this
century. The findings here concerning the relationship between the
decline in male and female hours necessitate further study of the
decline in hours in general. The average scheduled workweek in
manufacturing fell by almost 9 hours from 1900 to 1920, or by one
full workday. 6 Much of the decline occurred in the brief period from

4 The possibility that hours legislation was later passed in states in which hours had
already declined, and therefore in which there was less opposition, was confronted and
disproved by Landes. She found that “state differences in hours worked did not exist
prior to legislation” (1980, p. 481, n. 9). I have confirmed these results by a more direct
test and have found that among the states that did not pass their first law before 1909
(28 states), those that passed their first law from 1909 to 1914 did not have lower
scheduled hours in 1909. However, these states did have hours lowered by 1.65 compared
with the other states in 1919, consistent with the analysis in this paper.

5 On child labor laws in Britain, see Marvel (1977). Note that W. Landes and Solmon
(1972) interpreted the passage of compulsory schooling legislation as coming after most
children were in school for the legislated amount of time. Neither E. Landes nor I have
been able to find convincing empirical evidence to support the hypothesis for the case
of hours (see n. 4).

6 Trends in actual hours are only slightly different from those in scheduled hours.
Jones (1963), in a study of actual hours of work, found a decline of 7 hours from 1909
to 1919.
1916 to 1920 (U.S. Bureau of the Census 1975, ser. D-769, p. 168)
during and just after an outpouring of state hours legislation and 
coinciding with generally favorable economic conditions and trade 
union strength. 


I. The Impact of Maximum Hours Legislation on 
Hours of Work 

Did maximum hours legislation have an impact on the scheduled 
hours of women and, if so, by how much? Landes (1980) explored this 
question with state-level data from the 1919 Census of Manufactures, 
which aggregated male and female employees and gave scheduled 
(not actual) hours of work per week for firms.

To separate the impact of hours legislation on male and female
employees, an identity is estimated in which mean scheduled hours by
state are regressed on the female share of employment in manufacturing
(PMFF20), a dummy variable equal to one if the state passed a
maximum hours law by 1914 interacted with the percentage female
(PMFF20 x DUM), and several variables (South dummy = SD,
urbanization = PURB) to account for differences in hours across states.
By including only PMFF20 x DUM, the maximum hours law dummy
multiplied by percentage female, the impact of hours laws is constrained
to fall entirely on female employees. The coefficient on
PMFF20 x DUM is then the decrease in the number of hours worked
by women in states with hours legislation. 7

A more general specification would also include the dummy variable
(DUM), the coefficient of which is the decrease in the number of
hours worked by men in states with hours legislation. Then the coefficient
on PMFF20 x DUM is the difference in the decline in hours
(due to legislation) of women compared with men, and the coefficient
on the percentage female (PMFF20) is the difference between average
hours for females and males in the unconstrained states. The
constant term is the unconstrained value of male hours. 8

7 The identity is simply

$$H = \alpha_f H_f + (1 - \alpha_f) H_m + (\alpha_f \times \mathrm{DUM}) H_f \beta_f,$$

where $H$ is average scheduled hours, $H_f$ is average scheduled hours for females in
unconstrained states, $H_m$ is average scheduled hours for males in unconstrained states,
DUM equals one if a state has a maximum hours law, $\alpha_f$ is the percentage of manufacturing
employment that is female, and $\beta_f$ is the marginal impact of hours laws on mean
female hours and is expected to be negative. Rewriting yields the estimated equation

$$H = H_m + (H_f - H_m)\alpha_f + (H_f \beta_f)(\alpha_f \times \mathrm{DUM}).$$

Thus the coefficient on $\alpha_f$ is the difference between female and male hours in unconstrained
states, and the coefficient on $\alpha_f \times \mathrm{DUM}$ is the decline in female hours in
constrained states.

TABLE 1

Impact of Hours Legislation on Scheduled Weekly Hours by State, 1920

Independent
Variables             (1)        (2)        (3)        (4)

Constant              53.3       53.6       54.4       54.7
                      (69.1)     (76.9)     (86.4)     (55.7)
SD                    1.72       1.55       1.69       1.73
                      (3.43)     (2.72)     (3.03)     (3.01)
PURB                  -.05       -.059      -.058      -.059
                      (3.72)     (3.81)     (3.95)     (3.87)
PMFF20                .11        .133       .064       .041
                      (1.93)     (2.20)     (1.30)     (.48)
PMFF20 x DUM          -.08       -.101      ...        .035
                      (1.81)     (1.99)                (.35)
DUM                   ...        ...        -1.47      -1.82
                                            (2.55)     (1.56)
R^2                   .67        .61        .63        .63
Number of
observations          49         49         49         49

Sources.—Col. 1: Landes (1980, p. 480). Cols. 2-4: Hours data (HRS19) are from the 1919 Census of Manufactures,
vol. 8; PMFF20, manufacturing employment data, are from the 1920 Census of Population, vol. 4; DUM is from
information in Landes (1980, table 1) and U.S. Women's Bureau (1931); PURB is from the 1920 Census of Population,
vol. 1.

Note.—Dependent variable HRS19 = mean scheduled hours in manufacturing in 1919 (mean = 52.2); SD
= dummy variable for southern states; PURB = percentage of the state’s population that was urban in 1920;
PMFF20 = percentage of the manufacturing labor force that was female; DUM = 1 if the state passed its first
enforceable maximum hours law by 1914. The means of the (unweighted) independent variables are: SD = .91;
PURB = 43.4; PMFF20 = 12.3; DUM = .673. Absolute values of t-statistics are in parentheses. Note that none of
the equations has been weighted to account for heteroscedasticity in estimating an identity, however; see the text for
a justification.

Various hours equations estimated across states (and the District of
Columbia) are presented in table 1. Column 1 gives the results in
Landes; column 2 reestimates the same equation following the variable
construction in the original article and results in similar coefficients.
Column 3 omits the interaction of the dummy variable with the
percentage of manufacturing labor that was female but includes the
dummy variable. Finally, column 4 contains the least constrained estimation.

Landes’s estimation of the more restrictive equation indicated that 
hours legislation decreased scheduled hours of women by 8 per week. 
Note that 8 hours per week was 15 percent of mean scheduled hours 

8 The more general specification adds to that in n. 7 a term for the impact of hours
legislation on male hours:

$$H = \alpha_f H_f + (1 - \alpha_f) H_m + (\alpha_f \times \mathrm{DUM}) H_f \beta_f + [(1 - \alpha_f) \times \mathrm{DUM}] H_m \beta_m.$$

Rewriting yields the estimated equation

$$H = H_m + (H_f - H_m)\alpha_f + (H_f \beta_f - H_m \beta_m)(\alpha_f \times \mathrm{DUM}) + (H_m \beta_m)\mathrm{DUM}.$$
per week, almost one full weekday of work. Furthermore, the coefficient
on PMFF20 of 0.11 indicates that women working in states without
maximum hours legislation worked a full 11 hours more per week
than men did. It would be very surprising if states without restrictions
had scheduled hours for women that exceeded those in covered states
by a full weekday of work and if women in states without legislation
worked the equivalent of over one full day more than men did. These
conclusions, however, are not supported when a more general equation
is estimated and when more disaggregated data are used.

Hours legislation served to decrease average (male and female)
scheduled hours by about 1.5 per week, as can be seen by computing
the average or by estimating the equation with the dummy variable
not interacted, as in column 3. 9 The constrained estimation puts the
full burden of the legislation on female employment. Thus the reduction
in female hours would have to be approximately 12 hours, which
accounts for the estimate of 10.1 hours from column 2.
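
The arithmetic behind these magnitudes can be checked directly. The following is a minimal Python sketch that uses only figures reported in table 1 and n. 9; the variable names are illustrative and do not appear in the original.

```python
# Figures taken from table 1 and its note; the names are illustrative only.
avg_reduction = 1.47       # col. 3 coefficient on DUM (hours per week, all workers)
female_share = 0.123       # mean of PMFF20 expressed as a proportion
interaction_coef = 0.101   # col. 2 coefficient on PMFF20 x DUM (absolute value)

# If the whole average reduction must be borne by women, the implied
# reduction in female hours is the average reduction divided by their share.
implied_female_reduction = avg_reduction / female_share

# PMFF20 is measured in percentage points, so a fully female labor force
# (PMFF20 = 100) gives the constrained estimate of the female reduction.
constrained_estimate = interaction_coef * 100

# n. 9: the mean female percentage (12.3) times the col. 2 coefficient
# recovers the implied average reduction across all workers.
implied_average = 12.3 * interaction_coef

print(round(implied_female_reduction, 1))
print(round(constrained_estimate, 1))
print(round(implied_average, 2))
```

The three printed values are about 12, 10.1, and 1.24 hours, matching the figures quoted in the text and in n. 9.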

Consider instead the estimation in column 4, which also includes
DUM. The estimated coefficients reverse the earlier findings and suggest
an entirely new interpretation of hours legislation. Hours legislation
is found to have had no differential impact on female hours.
Instead, it reduced both male and female hours by about 1.8 hours.

One may rightly question whether these results have been produced
by a misspecified model. 10 Because scheduled hours in the
manufacturing census are also listed by industry as well as by state, the
decline in hours can be estimated for industries hiring only male
workers. The findings from the identity have been confirmed by an
estimation across states for an industry in which there were virtually
no female employees—foundries—and for groups of other male-intensive
industries, to be discussed below. In fact, the reduction in
scheduled hours of foundry workers in states with maximum hours
legislation (once again, legislation that covered only women) was virtually
identical to that from the full estimation in table 1. 11

9 The percentage of the manufacturing labor force that was female was 12.3. Multiplying
by the coefficient on PMFF20 x DUM from col. 2 gives 1.24.

10 As noted in the table, the equations were not corrected for possible heteroscedasticity
problems inherent in estimating identities.

11 The results for the foundry data (when the variable definitions and constructions
in table 1 are used) are, first for an unweighted sample and then for a weighted sample,

HRS19 = 54.9 - .046PURB - .095SD - 1.80DUM, R^2 = .38; N = 44;
       (61.0)  (2.70)     (.13)    (2.18)

HRS19 = 55.1 - .035PURB - .348SD - 1.89DUM, R^2 = .32; N = 44.
       (52.9)  (2.04)     (.42)    (2.16)

(Source: 1919 Census of Manufactures, vol. 8; see also table 1.) Five states had insufficient
employment in foundries to be listed in the census.
These findings suggest that protective legislation for women was
associated with a decline in hours of work for men. The reason for
this apparently peculiar result is, as suggested by the epigraph, that
laborers in states that passed protective legislation for women had
sentiments for decreased hours of work in general. They were able to
lobby more forcefully for laws covering women, whose plight appealed
to legislators, and state supreme courts did not, in general,
challenge laws that applied only to women. Note that the analysis does
not assess whether women were “hours constrained” in the presence
of legislation or whether men and women were hours constrained
prior to the large declines in hours from 1909 to 1919. Those are
separate and difficult issues.

The proposition that protective legislation was passed in states in
which male labor lobbied vigorously for general hours reductions can
be tested by using disaggregated data by industry for 1914 and 1919
from the 1920 Census of Manufactures (vol. 9). For each state the two
largest female-intensive industries and the two largest male-intensive
industries were selected. In the former, females were, on average,
about 50 percent of the labor force; in the latter, however, they were
less than 2 percent of the labor force. Those in male-intensive industries,
therefore, could not have viewed female labor as a direct threat;
these industries (e.g., lumber, foundries, steam car railroads) never
contained many female employees (if any at all).

Define MDIFF to be average scheduled hours for males in 1919
minus average scheduled hours for males in 1914 and FDIFF to be
the same for females. Let LIM14 be the existing weekly hours limit in
1914 (with the absence of a limit set equal to 66 hours). Then

MDIFF = -20.24 + .256LIM14 + 1.39SD + .0362PURB,
         (3.65)   (3.06)      (1.61)   (1.64)

R^2 = .22,

is obtained when estimated across the 49 states (including District of
Columbia) and indicates that the 1914 hours limit is positively related
to the change in hours for males from 1914 to 1919. Therefore, the
lower the limit, the greater the decline in male hours. However, for
females

FDIFF = 4.27 - .090LIM14 + .109SD - .0569PURB,
       (.85)   (1.19)      (.14)    (2.85)

R^2 = .17,

indicates that the decline in female hours was not related to the existing
1914 limit. These results, taken together, suggest that labor in
male-intensive industries lobbied effectively for hours limits for females
in states in which male laborers were ultimately successful at
lowering their own hours. In many of these states the dominant male-intensive
industry was lumber, in which the Wobblies led successful
strike activity in the unique World War I environment (see Hidy, Hill,
and Nevins 1963, pp. 332-51). Organized labor in male-intensive
industries may have cared about female hours of work because the
more laborers working shorter hours, the more it would become the
norm for all. 12
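
To make the magnitudes in the two change-in-hours equations concrete, the sketch below evaluates the reported coefficients for a state with and without a weekly limit. It is only an illustration: the 54-hour limit, the choice of a non-southern state, and the function names are assumptions, and PURB is set to the 43.4 mean reported in the note to table 1.

```python
def mdiff(lim14, sd, purb):
    """Fitted 1914-19 change in male scheduled hours (coefficients as reported)."""
    return -20.24 + 0.256 * lim14 + 1.39 * sd + 0.0362 * purb


def fdiff(lim14, sd, purb):
    """Fitted 1914-19 change in female scheduled hours (coefficients as reported)."""
    return 4.27 - 0.090 * lim14 + 0.109 * sd - 0.0569 * purb


NO_LIMIT = 66        # value assigned in the text to states without a limit
ASSUMED_LIMIT = 54   # hypothetical weekly limit chosen for illustration
PURB_MEAN = 43.4     # mean urbanization from the note to table 1

# Gap in the fitted change between a no-limit state and a 54-hour state
# (SD = 0, i.e., a non-southern state, is an assumption).
male_gap = mdiff(NO_LIMIT, 0, PURB_MEAN) - mdiff(ASSUMED_LIMIT, 0, PURB_MEAN)
female_gap = fdiff(NO_LIMIT, 0, PURB_MEAN) - fdiff(ASSUMED_LIMIT, 0, PURB_MEAN)

print(f"male gap:   {male_gap:.2f} hours")
print(f"female gap: {female_gap:.2f} hours")
```

Under these assumptions the fitted decline in male hours is about 3 hours larger in the state with the 54-hour limit, while the female gap is about 1 hour in the opposite direction and rests on a coefficient that is not statistically significant.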

A further complication with the statement that hours legislation
substantially reduced hours of work is that the manufacturing data
refer to scheduled, not actual, hours. It is actual hours worked that
are at issue. Data on actual hours of work by women are unavailable
on a national basis for this period of time but exist for various states in
the surveys of the Women's Bureau. 13 Scheduled weekly hours, actual
weekly hours, and the hours laws in effect at the time of the survey
are given in table 2 for states with complete information in the Women's
Bureau bulletins. The data show that mean actual hours were far
more similar across states than were mean scheduled hours. While
states with the least restrictive hours legislation had the highest scheduled
hours per week, actual hours per week worked by female employees
in manufacturing and mercantile establishments were far
lower than average scheduled hours. In Missouri and South Carolina,
which had among the highest scheduled hours, actual weekly hours
worked were only 82 percent of scheduled hours. In Illinois, New
Jersey, and Rhode Island, which had among the lowest scheduled
hours, the ratio was over 90 percent. The elasticity of actual hours
with respect to scheduled hours was 0.82. 14 Therefore, the difference
in actual hours between states with legislation and those without was
probably even less than what the difference in scheduled hours indicates.

12 While there is no implication that female employees were unconstrained in states
with maximum hours legislation, I have not been able to estimate a decline in their
hours (for the female-intensive industry sample) in states with restrictions (an exercise
similar to that in table 1). Male-intensive industry hours, however, were lower by 1.4 in
states with legislation, consistent with the foundry data.

13 These states are not, however, a random sample; the Women's Bureau directed its
efforts at states with higher than average scheduled hours that had requested surveys to
assist in evaluating their legislation or in formulating minimum wage standards. The
Women’s Bureau did include several states (such as New Jersey and Ohio) that had
restrictive hours legislation. Even with the biases in the sample, the unweighted mean
of scheduled hours is 51.2 or exactly 1 hour below the unweighted mean of scheduled
hours in the 1919 Census of Manufactures, which is the average for males and females.

14 An equation estimated across the 15 states in table 2 yields

log Actual Hours = .576 + .821 log Scheduled Hours;
                  (.78)   (4.40)

R^2 = .60; corrected R^2 = .57; t-statistics are in parentheses.
TABLE 2

Mean Scheduled Hours and Actual Hours Worked, by Selected States, 1920s

                              Scheduled   Actual               Hours Law in Effect
                              Hours       Hours     (2)/(1)    Daily/Weekly
State (Survey Date)           (1)         (2)       (3), %     (4)

Alabama (1924)                53.9        46.4      86.1       none
Arkansas (1922)               51.5        47.7      92.6       9/54
Delaware (1924)               50.4        41.1      81.5       10/55
Georgia (1920)                55.5        47.9      86.3       10/60*
Illinois (1924)               49.0        44.7      91.2       10/none
Kansas (1920)                 43.4        37.7      86.9       8/55
Kentucky (1921)               51.7        45.0      87.0       10/60
Mississippi (1925)            55.6        49.8      89.6       10/none
Missouri† (1922)              53.1        43.5      81.9       9/54
New Jersey (1922)             48.4        44.3      91.5       9/50
Ohio (1922)                   48.4        43.3      89.5       9/50
Oklahoma (1924)               51.1        44.4      86.9       9/54
Rhode Island (1920)           49.0        46.0      93.9       10/54
South Carolina (1921/22)      54.6        44.9      82.2       10/55/60‡
Tennessee (1925)              52.8        48.7      92.2       10.5/57

Source.—U.S. Women's Bureau (1919-27).

* Applies only to women working in cotton and woolen mills.
† Data are for white women only.
‡ 55 hours applies to textile factories; 60 hours elsewhere.
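
The ratio in column 3 and the elasticity reported in n. 14 can be recomputed from columns 1 and 2 of table 2. The sketch below does so with a plain least-squares fit of log actual hours on log scheduled hours; it is an illustrative check written for this purpose, not the author's original procedure, and the helper name `ols_slope_intercept` is hypothetical.

```python
import math

# (scheduled hours, actual hours) for the 15 states in table 2.
TABLE2 = {
    "Alabama": (53.9, 46.4), "Arkansas": (51.5, 47.7), "Delaware": (50.4, 41.1),
    "Georgia": (55.5, 47.9), "Illinois": (49.0, 44.7), "Kansas": (43.4, 37.7),
    "Kentucky": (51.7, 45.0), "Mississippi": (55.6, 49.8), "Missouri": (53.1, 43.5),
    "New Jersey": (48.4, 44.3), "Ohio": (48.4, 43.3), "Oklahoma": (51.1, 44.4),
    "Rhode Island": (49.0, 46.0), "South Carolina": (54.6, 44.9),
    "Tennessee": (52.8, 48.7),
}


def ols_slope_intercept(xs, ys):
    """Ordinary least squares fit of ys on xs with a constant term."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Column 3: actual hours as a percentage of scheduled hours.
ratios = {state: 100 * actual / sched for state, (sched, actual) in TABLE2.items()}

# n. 14: regress log actual hours on log scheduled hours across the 15 states.
log_sched = [math.log(sched) for sched, _ in TABLE2.values()]
log_actual = [math.log(actual) for _, actual in TABLE2.values()]
slope, intercept = ols_slope_intercept(log_sched, log_actual)

# n. 13: unweighted mean of scheduled hours across the surveyed states.
mean_scheduled = sum(sched for sched, _ in TABLE2.values()) / len(TABLE2)

print(f"elasticity (slope): {slope:.3f}, intercept: {intercept:.3f}")
print(f"mean scheduled hours: {mean_scheduled:.1f}")
print(f"Missouri ratio: {ratios['Missouri']:.1f} percent")
```

With the table 2 figures the fitted slope comes out close to the 0.82 elasticity given in n. 14, and the unweighted mean of scheduled hours is about 51.2, as stated in n. 13.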


The close resemblance of the hours equation in table 1 to that in the
original article indicates that the variables used are nearly identical.
Thus the next stage of the empirical work, that on employment, using
an almost identical set of variables should produce similar results.
That, however, is not the case.


II. The Impact on Employment: Theoretical
Underpinnings and Empirical Results

The impact of maximum hours legislation on the employment of
women in the covered sector provided the key result in the Landes
article. The estimation was motivated by a model of a labor market
that predicted, under the most reasonable parameter values, that the
female share of the covered sector would decline with effective maximum
hours legislation.

Landes posited a model of labor hours in which individuals choose
hours of work ($h_i$), say per week, as a function of their hourly wage
($w$), $h_i = w^{\epsilon_{hi}}$, and then choose to supply their labor ($L_i$), say in labor
days per week, as a function of their weekly earnings ($wh_i$),
$L_i = (wh_i)^{\epsilon_{Li}}$, where $i$ = male ($m$) or female ($f$) workers. Workers are
perfect substitutes for each other, and thus the wage is the same for
both groups. Total hours of labor services, $S = \sum_i h_i L_i$, are identical, in
equilibrium, to firm demand: $D = w^{-\eta}$. Effective maximum hours
legislation for females alters the equilibrium in two manners: the
elasticity of female hours with respect to the wage ($\epsilon_{hf}$) automatically
becomes zero, and hours of work for females are reduced if the
constraint is binding. The effects of the reduction in female hours on
the total supply of hours, the wage, male labor, male hours, and
female labor are given by $dS/dh_f > 0$, $dw/dh_f < 0$, $dL_m/dh_f < 0$,
$dh_m/dh_f < 0$, and $dL_f/dh_f \gtrless 0$.

Thus all effects are unambiguous in sign except for that on female
labor, which is the derivative of interest. It depends on the relationship
between the elasticity of demand for labor hours plus an elasticity-weighted
male labor force share and the female labor force share,
$(\eta + \theta_m - s_f) \gtrless 0$. 15 As Landes points out, the most reasonable
parameter estimates yield a positive impact. The female labor force in
the covered sector would decline with effective maximum hours legislation.

This key result was tested (and affirmed) in the Landes article by
estimating the impact on the female share of the manufacturing labor
force of hours legislation. Although the empirical results obtained by
Landes are consistent with the predictions of the model, I will demonstrate
that the results are highly sensitive to a change in the construction
of a key variable. The new construction is, I believe, more appropriate
given the legislation and the limitations of the available data.
The estimation with the changed variable can be better explained by a
slightly revised model, in which female labor supply, in days worked
per week, increases with reduced hours worked per day. 16 The possibility
that maximum hours laws could have expanded female employment
should not be surprising. It has been frequently asserted that
female labor force participation rates rose over the long run because
scheduled hours of work per day declined, enabling women with responsibilities
at home to work more days. 17

Alternatively, the amended results can be understood within the
context of the related decline in the work hours of male manufacturing
laborers. Had hours of work not declined for men, the impact of
the legislation on the female employment share might have been
greater.

15 The term $\theta_m = s_m[\epsilon_{Lm} + \epsilon_{hm} + (\epsilon_{Lm} \times \epsilon_{hm})]$ and $s_i = L_i h_i / S$.

16 This model is presented in a longer version of this paper (Goldin 1986) and is
supported with evidence on the relationship between scheduled hours per day and days
worked per week from Women’s Bureau bulletins. These data suggest that women in
states with high scheduled hours per day reduced the number of days they worked per
week and that hours reductions per day served to increase the number of days worked
per week.

17 As Durand (1948, p. 118) noted: “The secular decrease in weekly hours of work is
perhaps almost as important as the change in occupational composition of the demand
for labor, as a factor in the increasing employment of women. The length of the
working week is especially important in connection with the availability of married
women for jobs.”

In the employment equation estimated by Landes, the dependent
variable was the percentage of the total manufacturing labor force
that was female in 1920 (PMFF20), and the key independent variable
accounted for the degree of restrictiveness of the state’s maximum
hours legislation (REST). Other variables were included to account
for differences in the demand for or supply of female workers, such
as urbanization, region, and a lagged employment share in manufacturing
capturing a host of relevant factors.

The restrictiveness variable measures the percentage of the state’s
manufacturing labor force in 1909 that worked (in actuality, the percentage
working in establishments that had scheduled hours) over the
legal maximum in effect in 1914. The variable accounts for prior
conditions and gives the proportion of the labor force in 1909 that
would be constrained by the hours legislation passed by 1914. Note
that the restrictiveness variable is highly appropriate for this exercise.
It is superior to a simple dummy variable indicating whether or not a
state passed an hours law sometime between the year of the dependent
variable (1920) and that of its lagged value (1900). Landes also
included a dummy variable (here DUM1905-14) equal to one if a state
passed its first enforceable maximum hours law between 1905 and 1914.

Landes’s estimated regression, given in column 1 of table 3, indicates
that states with more restrictive legislation had a lower female
employment share in manufacturing. Further estimations by Landes
indicate that most of the decline in the employment share occurred
for the daughters of the foreign-born and for foreign-born women. 18
These results provided persuasive evidence that hours legislation was
passed under the guise of humanitarian concern through the efforts
of labor groups and others that stood to gain the most from restricting
the employment of immigrant women and their daughters.

Note, however, the other regressions for the manufacturing sector
appearing in table 3. These were estimated on identical variables by
state, most of which were also in the table 1 estimation. Unlike those
in table 1, there is little relationship between my results and those of
Landes. Most important, the coefficient on the restrictiveness variable,
here called WKREST, is generally positive but insignificant and
that on the hours legislation dummy variable is negative and barely
significant. These results are relatively robust to restricting the sample


18 Note that in the estimation for the daughters of the foreign-born, Landes got a
more significant coefficient on REST (her table 3, p. 487) than in the estimation for all
women.
TABLE 3

Effect of Hours Legislation on the Employment Share of Women in
Manufacturing and Sales, 1920

                     PMFF20                           PMFF20
Independent          (Landes)   PMFF20     PMFF20     (Manufacturing)   PSF20
Variables            (1)        (2)        (3)        (4)               (5)

Constant             -.00168    -.013      -.027      .041              .150
                     (1.11)     (1.28)     (2.02)     (3.53)            (6.14)
EMP_{-1}             .79        .753       .804       .829              .772
                     (9.66)     (11.4)     (10.4)     (13.9)            (8.26)
SD                   .0005      .010       .014       .007              -.006
                     (.06)      (1.16)     (1.52)     (.72)             (.56)
PURB                 .0005      .0003      .0004      -.0001            -.0005
                     (2.26)     (1.44)     (1.81)     (.48)             (2.52)
DUM1905-14           -.0012     -.0157     -.0178     -.0130            -.0072
                     (.14)      (1.83)     (1.87)     (1.23)            (.82)
REST                 -.0253     ...        ...        ...               ...
                     (1.49)
WKREST               ...        .0181      .0125      .0181             .0215
                                (1.39)     (.86)      (1.12)            (1.63)
R^2                  .83        .86        .84        .90               .79
Number of
observations         41         49         41*        41                41

Sources.—Col. 1: Landes (1980, p. 484). Cols. 2-5: DUM1905-14 is constructed from data in Landes (1980,
table 1) and U.S. Women's Bureau (1931) and was altered for estimation of col. 5; WKREST is constructed from the
1909 Census of Manufactures, vol. 8, and U.S. Women's Bureau (1931); EMP_{-1} is from U.S. Bureau of the Census
(1904) and the 1910 Census of Population, vol. 4; PMFF20 and PSF20 are from the 1920 Census of Population, vol. 4;
PMFF20 and EMP_{-1} for col. 4 are from the 1909 Census of Manufactures, vol. 8, and the 1919 Census of Manufactures,
vol. 8; PURB is for 1920 and is from the 1920 Census of Population, vol. 1.

Note.—For the dependent variables, the mean of PMFF20 is .123 (.144 for col. 4); the mean of PSF20 is .333.
PMFF20 = female employment share of manufacturing in 1920; PSF20 = female employment share in sales
(salespersons and clerks in stores) in 1920; EMP_{-1} = female employment share in manufacturing (sales for col. 5) in
1900 (1910 for sales); DUM1905-14 = 1 if first enforceable maximum hours law in manufacturing (sales for col. 5)
was passed from 1905 to 1914; WKREST (REST) = proportion of employees in 1909 who worked over the
maximum number of weekly hours (for REST it is the daily limit times six) in effect in 1914 (see text). Col. 1 divides
all coefficients by 100 (except EMP_{-1}) because the numbers in Landes express the share as a percentage. Ordinary
least squares estimation was used for consistency with Landes. A weighted logit transformation yields almost identical
slopes around the mean for the 48 states and District of Columbia and the nonmountain sample. Means for the
entire 48 states and District of Columbia are: EMP_{-1}, .156; SD, .306; PURB, 43.4; DUM1905-14, .286; WKREST,
.362.

* The eight mountain states are excluded for consistency with the Landes estimation.


to the 40 states (and District of Columbia) that are highlighted in the 
original article (i.e., excluding eight mountain states having few 
manufacturing workers), to weighting the regression by the square 
root of manufacturing employment in the state (not included in the 
table), and to estimating a (weighted) logistic transformation of the 
dependent variable (also not in the table). 19 


19 The original paper focused on the impact of hours legislation on the employment 
of native-born, foreign-parentage women and foreign-born women. The results of 
these estimations had yielded larger and more statistically significant coefficients than 
in the entire sample. My attempt to replicate these results did not yield significant 
effects on the key variable, WKREST. These equations have been estimated across the 
41 (nonmountain) states (including District of Columbia) used in the original article: 
The source of the difference is the computation of the restrictiveness
variable (termed REST by Landes and WKREST here). Many
states passed weekly hours laws that were more restrictive than the
daily limit times six. The REST variable in the original article was
computed using a weekly restriction that was always six times the daily
restriction even for states with lower weekly limits, despite the fact
that the 1909 data used to create REST were for weekly scheduled
hours. That procedure produced estimates that differ from those
using the weekly legislation in 12 states (or one-quarter of the sample),
including every New England state, Pennsylvania, Ohio, and
Delaware. The WKREST variable differs from the REST variable, on
average, by a factor of 10 across the 12 states.
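
The distinction between the two constructions can be illustrated with a small calculation. The following sketch is purely hypothetical: the distribution of 1909 scheduled hours, the 10-hour daily and 55-hour weekly limits, and the function name `share_over_limit` are invented for illustration and are not taken from the census or from either article.

```python
def share_over_limit(hours_distribution, weekly_limit):
    """Share of employees whose scheduled weekly hours exceed the limit."""
    return sum(share for hours, share in hours_distribution.items()
               if hours > weekly_limit)


# Hypothetical 1909 distribution: scheduled weekly hours -> share of employees.
hours_1909 = {48: 0.10, 54: 0.30, 57: 0.20, 60: 0.40}

daily_limit = 10    # hypothetical daily limit in effect in 1914
weekly_limit = 55   # hypothetical weekly limit, tighter than six times the daily limit

# REST-style construction: the weekly threshold is always six times the daily limit.
rest = share_over_limit(hours_1909, 6 * daily_limit)

# WKREST-style construction: the legislated weekly limit is used directly.
wkrest = share_over_limit(hours_1909, weekly_limit)

print(f"REST-style share:   {rest:.2f}")
print(f"WKREST-style share: {wkrest:.2f}")
```

In this made-up case no employee exceeds the 60-hour threshold implied by the daily limit, while 60 percent exceed the 55-hour weekly limit, so the two measures rank the law's restrictiveness very differently.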

Although the WKREST variable is not significant in the new estimation
and is even positive, not negative, the DUM1905-14 variable
is negative and barely significant. 20 States that passed their first hours
law between 1905 and 1914 were latecomers and may have been
different for other reasons. The PMFF20 variable uses data on occupations
from the population census, not the manufacturing census,
because only the population census differentiated workers by nativity.
The population census, however, included nonfirm manufacturing
workers and generally overstated manufacturing employment, particularly
in the less developed states. In many of the latecomer states,
most of the female manufacturing workers in the population census
were dressmakers. By 1920 the services of these workers had often
been replaced by factory-produced goods. The estimation in column

native-white, foreign-parentage (NF) women as a share of all NF workers in the manu¬ 
facturing labor force equals 

- .0060 + -849EMP., - .0263SD + .0008PURB 

(3.23) (10.2) (2.24) (2.82) 

- .0252DUM1905-14 + .0034WKREST; 

(2.07) (.18) 

R² = .87; N = 41; mean of dependent variable = .165;

foreign-born (FB) women as a share of all FB workers in the manufacturing labor force 
equals 

-.0087 + .753EMP.., + .0019SD + .0003PURB 
(.82) (11.0) (2.11) (1.21) 

- .0083DUM1905-14 - .00S6WKREST; 

(.84) (.23) 

R² = .85; N = 41; mean of dependent variable = .085.

(Sources: See table 3 and the 1920 Census of Population, vol. 4, for independent vari¬ 
ables.) 

20 It should also be pointed out that removing the WKREST variable reduces the 
significance of the DUM1905-14 variable and lowers its coefficient in absolute value 
for all estimations. 
4 uses the manufacturing census data, and the coefficient on
DUM1905-14, while still negative, is not significant. It is possible that 
some of the latecomer states had male labor forces that desired to 
reduce female employment, but the evidence is not strong. 21 

Focus now on a different equation in table 3, one in which the 
dependent variable is the percentage of sales (not clerical) labor force 
that is female in 1920 (PSF20), as in column 5. Retail trade was also 
covered by maximum hours legislation, and in some states mercantile 
establishments were covered before manufacturing firms. An equa¬ 
tion similar to that for the manufacturing sector was estimated for the 
sales sector with results that are very different. The female share of 
sales employment actually increased in states having more restrictive 
hours legislation. 22 It should be emphasized that the decline in the
female share of the manufacturing labor force during the 1900-1920 
period did not mark a new trend. 23 The female share of the manufac¬ 
turing labor force had been declining for decades preceding max¬ 
imum hours legislation, and the share probably peaked as early as the 
1840s (see Goldin and Sokoloff 1982). Female employment in the 
sales sector, however, began to increase during the first decades of 
this century. The coefficient on WKREST suggests that maximum
hours legislation, by reducing daily hours in the sales sector, may have 
increased the employment share of females. Thus while the restric¬ 
tiveness of maximum hours legislation may have had little or no effect 
on female employment in manufacturing, it may have had a positive 
effect on female employment in sales. 

The variables used thus far to measure the employment effect are 
the shares of women in a particular sector. The female labor force 
participation rate could also have been altered by hours legislation. 
Under the gainful worker definition, participation was probably re¬ 
lated to the average number of days an individual worked (see Goldin 
1987). Thus decreasing hours of employment could have increased 


21 Baker ([1925] 1968, chap. 6) contains several examples of industries in which male 
workers probably benefited from maximum hours legislation. My claim here is that, on 
average, female employment did not decline with the restrictiveness of the legislation 
but that female employment in certain industries and in certain states may have been 
adversely affected. 

22 Note that the restrictiveness variable was computed for the manufacturing labor 
force because scheduled hours for mercantile establishments were not available. The 
Women’s Bureau bulletins cited in table 2 indicate, however, that there was relative 
homogeneity within states across manufacturing and sales hours schedules. Note, as 
well, that the estimation includes the percentage of sales workers who were female in 
1910 rather than the percentage in 1900, as in the manufacturing estimations. While it 
would have been more appropriate to use 1900, only 1910 data were available for sales 
workers. 

23 The decline is evident only in the population census data. The manufacturing
census data show an increase in the female share. 
days worked and the labor force participation rate. To assess the
proposition, a relationship was estimated across large urban areas 
(cities having over 100,000 persons) in various states between the 
labor force participation rate of a group of women, for example, 
native-born white married women, and mean scheduled hours of 
work in manufacturing. 

Constant elasticity equations were estimated for three groups of white
women (native-born of native parentage, native-born of foreign parentage,
and foreign-born), separately for married women and for all marital statuses.
The coefficients on hours per day in all equations are negative and substantial,
indicating that shorter hours were associated with higher participation
rates. The result holds across all subgroups of white women, but
it is strongest, by nativity, for married women. 24 The generally larger
elasticities for married women indicate that days worked per year 
were more responsive to scheduled hours per day for those with 
greater home responsibilities. This result is similar to that in King 
(1978), who found that labor force participation among married 
women with children in 1970 was higher in cities in which men 
worked fewer hours. 


III. Summary Comments 

The causes and consequences of maximum hours legislation have 
been explored and found to differ from the interpretation presented
by Landes. In particular, hours declined for men as well as for women
in states with hours legislation, and the employment share of women
in manufacturing did not decrease with the restrictiveness of the 


24 The elasticities of labor force participation with respect to mean scheduled hours
of work in manufacturing in 1920 were computed using only one additional independent
variable, the southern dummy (SD). At present insufficient wage data are available
to include in the estimation. (Dependent variable is the log labor force participation
rate.)

                   NN-A
log HRS 19        -.719
                  (1.18)
SD                -.100
                  (1.48)
Constant           6.31
                  (2.63)
R²                  .23
Mean               32.0

NN = native-born, native parentage; NF = native-born, foreign parentage; FB
= foreign-born; A = all marital statuses; M = married. Equations were estimated
across the large urban areas of 31 states. Absolute values of t-statistics are in parentheses.
(Sources: See table 1 and Hill 1929.)
legislation. Indeed, the employment share of women in another cov¬
ered sector, sales, rose with increasing restrictiveness, and female 
labor force participation rates were positively correlated with shorter 
hours. 25 

This work has raised further questions about hours legislation and 
the long-term decline in the workday and workweek in America. I 
have suggested the reasons for the relationship between the decline in 
hours worked by men and legislation protecting women, but it is still 
not clear what precise mechanisms operated to reduce hours of work 
for all. 
I. Introduction

Many observers of gambling markets are familiar with pamphlets and 
books—backed by considerable statistical “evidence”—that for a small 
price provide the reader with one or more “profitable” betting systems.
In a recent paper, Zuber, Gandar, and Bowers (1985) presented
such a system that they claim represents “true exploitations of 
inefficiencies in the NFL gambling market” (p. 805). In addition to 
this “stronger test” of efficiency, Zuber et al. presented a “weak test,” 
in which they were unable to reject the hypothesis that the scores of 
National Football League games are unrelated to the predictions given 
by the Las Vegas gambling line. 

We are grateful to Tom Goodwin, Levis Kochin, Charles Nelson, and Frank Wolak 
for their comments. We also wish to acknowledge the contribution of Rick Zuber and 
John Gandar, who checked our data and provided helpful criticism of an earlier draft. 
Remaining errors are, of course, our own. 
In this comment, we argue that the tests performed by Zuber et al.
are misleading, and we present a variety of evidence that contradicts 
their assertion that the betting market for NFL games is inefficient. 
Specifically, a more powerful specification of the weak test decisively 
rejects the hypothesis that scores are unrelated to the Vegas line. In 
addition, we show that the predictors of net efforts used by Zuber et 
al. in betting simulations add essentially no information to that al¬ 
ready incorporated in the Las Vegas line. This result is substantiated 
when the betting system proposed by them is extended out of sample 
in an unsuccessful attempt to produce profits. 


II. Weak Tests of Market Efficiency 

The initial efficiency test examined by Zuber et al. required estima¬ 
tion of the equation 

$$PS_{it} = b_0 + b_1 \times VL_{it} + u_{it}, \qquad (1)$$

where $PS_{it}$ denotes the actual point differential between teams playing
the ith game in week t, $VL_{it}$ is the Las Vegas gambling line, and $u_{it}$ is
the error term. The efficient markets hypothesis (EMH) asserts that
$VL_{it}$ is an optimal, unbiased predictor of $PS_{it}$. A test of the EMH is
thus a joint test of the null hypothesis that $b_0 = 0$ and $b_1 = 1$. An
“extreme alternative” test proposed by Zuber et al. is the joint test that
$b_0 = b_1 = 0$, that is, that the actual point differential is unrelated to
the Vegas line. They estimated equation (1) for each of the 16 weeks 
of the 1983 season and could not reject the null hypotheses for 13 and 
15 of the 16 weeks, respectively. They concluded that an alternative 
testing strategy is required. 

We agree. In particular, there is a stronger “weak” test. In the
absence of systematic differences in the PS-VL relationship between 
different weeks, there is no a priori reason (and no justification of¬ 
fered by the authors) to base efficiency tests on weekly samples of 14 
games each, given data for the entire 16-week season. It is well known 
that the variance of the least-squares estimator is inversely related to 
sample size (see, e.g., Judge et al. 1982, pp. 263-66). By splitting the 
season into 16 parts, Zuber et al. effectively increased the sample 
variance of their estimators, thus making it more likely that they 
would be unable to reject the hypothesis that the coefficient of the 
Vegas line is zero. 1 Splitting one large sample into many small samples 
makes for a less powerful test. 
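
The loss of precision from splitting the season can be illustrated with a small simulation. The sketch below is not part of the original analysis: the standard deviations assumed for the line and for the unpredictable part of the score (5 and 14 points) are illustrative values only, and the function names are hypothetical. It compares the sampling spread of the least-squares slope in samples of 14 games with that in samples of 224 games when the true slope is 1.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x in a regression with an intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def slope_spread(n_games, reps=2000, seed=0):
    """Empirical standard deviation of the slope across simulated samples."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        # Assumed data-generating process: score = line + noise (slope of 1).
        line = [rng.gauss(0.0, 5.0) for _ in range(n_games)]
        score = [v + rng.gauss(0.0, 14.0) for v in line]
        slopes.append(ols_slope(line, score))
    return statistics.stdev(slopes)


weekly = slope_spread(14)    # one week of games
season = slope_spread(224)   # the full 16-week season
print(f"slope s.d., n=14: {weekly:.3f}; n=224: {season:.3f}")
```

Under these assumptions the weekly estimator is several times noisier than the full-season estimator, which is why a weekly test will rarely reject a zero coefficient on the line even when the true coefficient is 1.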


1 This problem is particularly acute given the unpredictable nature of NFL games. 
The standard deviation of the difference in scores is 3.fl times that of the Vegas line for
1983. The likely reason for this is that large differences in scores are difficult to predict. 





TABLE 1

OLS Regression of Equation (1): 1983 Regular NFL Season
(Dependent Variable: Home-Away Score Differential)

Coefficient             Estimate    t-Statistic
Intercept (b_0)           -.554        -.345
Vegas line (b_1)           .918        4.579
R²                         .086
Degrees of freedom          222

Increasing the sample size from 14 to 224 yields a different picture.
In table 1 we present ordinary least squares (OLS) estimates of equation
(1) using the full sample of data for the 1983 season. One cannot
reject the null hypothesis that $b_0 = 0$ and $b_1 = 1$ at the 95 percent level
(the calculated F-statistic is 0.25, well below the critical value of 3.0). 2
As did Zuber et al., we are unable to reject the EMH. Contrary to
their results, the “extreme alternative” hypothesis that the score differences
and the Vegas line are unrelated ($b_0 = b_1 = 0$) is decisively
rejected when data for the full season are simultaneously utilized (F
= 12.2). 3
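
Both hypotheses are joint restrictions on the two coefficients of equation (1), so each can be tested by comparing restricted and unrestricted sums of squared residuals. The following is a minimal sketch of that comparison, assuming the standard F-statistic for two linear restrictions; the function names and the four-observation data set are hypothetical and serve only to show the mechanics, not to reproduce the statistics reported above.

```python
def ols_fit(x, y):
    """Return (intercept, slope) from a least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def joint_f(x, y, b0_null, b1_null):
    """F-statistic for the joint null b0 = b0_null and b1 = b1_null."""
    n = len(x)
    b0, b1 = ols_fit(x, y)
    ssr_unrestricted = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    ssr_restricted = sum(
        (yi - b0_null - b1_null * xi) ** 2 for xi, yi in zip(x, y)
    )
    # Two restrictions in the numerator, n - 2 degrees of freedom below.
    return ((ssr_restricted - ssr_unrestricted) / 2) / (ssr_unrestricted / (n - 2))


# Hypothetical toy data: Vegas line and realized score differential.
line = [0.0, 1.0, 2.0, 3.0]
score = [1.0, 3.0, 2.0, 4.0]

f_emh = joint_f(line, score, 0.0, 1.0)      # null: b0 = 0 and b1 = 1
f_extreme = joint_f(line, score, 0.0, 0.0)  # null: b0 = b1 = 0
print(f_emh, f_extreme)
```

With 224 observations, the statistic is compared with the critical value of an F distribution with 2 and 222 degrees of freedom, which is the 3.0 cited in the text.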

III. Information Provided by the Zuber et al. Variables

Zuber et al. presented regression results that demonstrate that the 
actual home-away point differential is highly correlated with the con¬ 
temporaneous difference in the efforts of the two teams on that day. 4 


Indeed, 34 games in the 1983 season were decided by a three-touchdown margin or 
more (21 points), but there was no instance in which the Vegas line exceeded 15.5 
points. The average difference between the predicted and actual scores for these out¬ 
liers is 25.7 points. Furthermore, the favorite lost in nine of these games. The effect of 
one or two of these so-called blowouts in a sample size of 14 is likely to be severe. 

2 It is worth noting here that, under the EMH, the 224 observations in the sample are 
presumed to be independent (otherwise bettors could make use of the dependence to 
make profits, contradicting the EMH). Without knowing if the EMH is true, one cannot 
be sure that this is so. For example, it could be that the Vegas line adjusts slowly to 
superior or inferior performance, in which case this week’s PS, VL pair would be
partially dependent on last week’s pair. But the effect on eq. (1) would be to bias
standard errors downward, making more likely a rejection of the hypothesis that $b_0 = 0$
and $b_1 = 1$. Hence this consideration strengthens the result that we are unable to reject
the EMH. We are grateful to Sam Peltzman for bringing this point to our attention. 

3 Zuber et al. were aware of the full-season results but included only the weekly
results in the published version of their paper. We conducted tests for 1984 also, with 
similar results (the F-statistics are 2.46 and 40.82, respectively). 

4 The variables used as proxies for “team efforts” were yards passed, yards rushed,
fumbles lost, interceptions, penalties incurred, percentage of passing plays, and two 
indexes of team strength: number of rookies and number of previous wins. 
TABLE 2

Contemporaneous Model of Zuber et al.
(Dependent Variable: Home-Away Score Differential)

                             1983 NFL Season             1984 NFL Season
                             (Zuber et al.)              (Sauer et al.)
Explanatory               Coefficient                 Coefficient
Variables                  Estimate    t-Statistic     Estimate    t-Statistic
Intercept                    1.547        2.66           1.548        2.40
Rushing yards                 .047        4.70            .025        1.99
Passing yards                 .044        6.29            .046        6.67
Previous wins                 .697        2.32            .539        3.47
Fumbles lost                -2.299       -5.50          -2.671       -5.15
Interceptions               -2.619       -7.61          -2.654       -6.27
Penalties                    -.424       -2.48           -.188       -1.07
Percentage passing
  to total plays             -.217       -4.34           -.262       -4.10
Rookies                      -.319       -2.45           -.607       -3.71
R²                            .733                        .805
Degrees of freedom             103                         103

Source.—Data on all variables except rookies were obtained from box score summaries printed in the Los Angeles
Times. Data for the number of rookies per team were obtained from the office of the National Football League.


Specifically, they estimated the OLS equation 

$$PS_{it} = \beta \cdot (X^h_{it} - X^v_{it}) + \varepsilon_{it}, \qquad (2)$$

where $X^h_{it}$ is the vector of team efforts for the home team during game
i in week t, $X^v_{it}$ is the corresponding vector for the visiting team, h and
v are indexes identifying the home and visiting teams, $\beta$ is the
coefficient vector, and $\varepsilon_{it}$ is the error term. The summary statistics
in table 2 confirm that, for 1984 as well as 1983, equation (2) “explains” a
large percentage of the variance of score differentials.

It is another matter to predict the efforts of the teams. In their 
betting simulation, Zuber et al. used as predictors of future effort per 
game averages of their effort variables, calculated from previous 
weeks. But do these predictors add any information to that already 
provided by the publication of the Vegas line in daily newspapers? 
The EMH implies that market forecasts such as the Vegas line fully 
reflect all relevant information. Hence, including the Zuber et al. 
predictors along with the Vegas line in a regression equation in which 
the dependent variable is the score differential would not, under the 
EMH, improve on a forecast based solely on the Vegas line. 

We address this question by employing the OLS equation 

$$PS_{it} - VL_{it} = \beta \cdot (\hat{X}^h_{it} - \hat{X}^v_{it}) + w_{it}, \qquad (3)$$
TABLE 3

Information Added by the Zuber et al. Variables: 1983
(Dependent Variable: Score Differential Minus Vegas Line)

Coefficient                    Estimate    t-Statistic
Intercept                        -.217        -.158
Rushing yards                    -.077       -1.192
Passing yards                     .027         .772
Wins                              .781        1.331
Fumbles                          2.544         .571
Interceptions                    3.715        1.102
Penalties                         .011         .013
Pass percentage                  -.224        -.668
Rookies                           .282         .714
R²                                .051
Adjusted R²                      -.023
Sum of squared residuals        21,513
Total sum of squares            22,674
Degrees of freedom                 101

Note.—The results are for the final 8 weeks of the 1983 season. Similar results for
several subsets of the 1983 and 1984 seasons are available from the authors on request.


in which the difference between the home-away point differential and
the Vegas line is regressed on the Zuber et al. predictors:
$\hat{X}^h_{it} = [1/(t - 1)] \cdot \sum_{j=1}^{t-1} X^h_{ij}$, with $\hat{X}^v_{it}$
similarly defined. 5 The equation is estimated for
the second half of the 1983 season, the same period during which the
authors’ system showed a positive profit. Summary statistics are presented
in table 3. One cannot reject the null hypothesis (EMH) that
the coefficients of their variables are all jointly equal to zero (the F-test
statistic for the hypothesis that $\beta_1 = \beta_2 = \cdots = \beta_8 = 0$ is 0.55, well
below the critical value of 1.91 at the 95 percent level). 6 Furthermore,
the regression “explains” only 34 points of the score differential that 
are not accounted for by the Vegas line. 7 This amounts to a difference 
of 0.35 points per game, which seems difficult to exploit. We conclude 
that the Zuber et al. variables add virtually no information to that 
already imbedded in the Vegas line. 
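
The summary figures quoted from table 3 follow directly from its two sums of squares. As a check on the arithmetic only, the short sketch below recomputes the share of variance explained and the number of points “explained” by the regression; the variable and function names are hypothetical, and the inputs are the rounded sums of squares printed in the table.

```python
import math

SSR = 21513.0  # sum of squared residuals, table 3
TSS = 22674.0  # total sum of squares, table 3


def r_squared(ssr, tss):
    """Share of the variation in (PS - VL) accounted for by the regression."""
    return 1.0 - ssr / tss


def explained_points(ssr, tss):
    """Square root of the explained sum of squares, in points."""
    return math.sqrt(tss - ssr)


print(round(r_squared(SSR, TSS), 3))
print(round(explained_points(SSR, TSS), 1))
```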


5 This procedure is equivalent to regressing the point differential on the Zuber et al. 
variables and the Vegas line, with the restriction that the coefficient on the Vegas line
equal 1.0. See the Appendix for estimation results in which the coefficient of the Vegas 
line is unrestricted. 

6 Note that this test uses an information set restricted to the authors' variables; hence 
more powerful tests of the EMH that use additional relevant information could be 
constructed. The purpose here, however, is to focus on the forecasting power of their 
variables. Note also that, by the argument in n. 2, the t-statistics for the Zuber et al.
variables in table 3 would have an upward bias. 

7 This figure is the square root of (22,675 - 21,514) from table 3. For 1984, the 
figure is 39 points and the F-statistic for the test of the EMH is 0.94.





IV. Betting Simulations for 1984 

The argument of Zuber et al. that the market for NFL betting is 
inefficient rested on the demonstration that, for 1983, positive profits 
could have been earned by employing a particular betting strategy.
The evidence presented above suggests that their result may be 
spurious. 

The authors’ method uses sequential reestimations of the contem¬ 
poraneous explanatory model over the last half of the season to pro¬ 
vide parameter estimates, which are then applied to the predicted net 
efforts of the two teams. This procedure yields the prediction 

$$\widehat{PS}_{it} = \hat{\beta}_t \cdot (\hat{X}^h_{it} - \hat{X}^v_{it}), \quad t = 9, 10, \ldots, 16, \qquad (4)$$

where the vector $\hat{\beta}_t$ is obtained by estimating equation (2) with data
through week t - 1, and $\hat{X}^h_{it}$ and $\hat{X}^v_{it}$ are as previously defined.

Next, a filter test is employed, indicating that bets are to be placed
whenever $|\widehat{PS}_{it} - VL_{it}| > \delta$, where δ is an arbitrary filter value. Once it
is determined that a bet should be placed, the sign of $(\widehat{PS}_{it} - VL_{it})$
indicates the team to bet. A positive sign implies a bet on the home
team; a negative sign indicates a bet on the visitor. Winning bets are
those for which $(PS_{it} - VL_{it})$ and $(\widehat{PS}_{it} - VL_{it})$ are of the same sign.

The procedure was undertaken for δ = 0.5, 1, 2, and 3. Table 4
presents the 1984 results along with those for 1983 using δ = 0.5.
Only 39 of 101 bets were winners in 1984. The winning percentage of
38.6 percent is far less than the 52.4 percent needed to break even
(see Zuber et al. 1985, p. 804), resulting in net losses of $2,920. 8 This
occurs despite the fact that the explanatory power of the contemporaneous
Zuber et al. model for 1984 exceeds that for 1983 (R² of
.805 vs. .733). We conclude that the authors have not constructed a
model that can consistently produce profitable gambling opportuni¬ 
ties. 
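
The filter rule and the dollar figures can be written out compactly. The sketch below is an illustration under stated assumptions rather than the authors’ own code: it assumes each bet risks $110 to win $100, which is inferred from table 4 (12 bets correspond to $1,320 gambled) and from the 52.4 percent break-even rate, and it treats a game that lands exactly on the line as a nonwinning bet. All function names are hypothetical.

```python
STAKE = 110.0   # assumed amount risked per bet (inferred from table 4)
PAYOFF = 100.0  # assumed amount won on a successful bet


def break_even_rate(stake=STAKE, payoff=PAYOFF):
    """Winning fraction at which expected net return is zero."""
    return stake / (stake + payoff)


def pick_side(predicted_diff, vegas_line, delta):
    """Apply the filter: bet only if the prediction and line differ by more than delta."""
    gap = predicted_diff - vegas_line
    if abs(gap) <= delta:
        return None
    return "home" if gap > 0 else "visitor"


def bet_wins(actual_diff, predicted_diff, vegas_line):
    """A bet wins when actual and predicted differentials fall on the same side of the line."""
    return (actual_diff - vegas_line) * (predicted_diff - vegas_line) > 0


def net_return(wins, bets, stake=STAKE, payoff=PAYOFF):
    """Dollar return from a given number of winning and losing bets."""
    return payoff * wins - stake * (bets - wins)


print(round(100 * break_even_rate(), 1))  # break-even winning percentage
print(net_return(39, 101))                # 1984 season with a filter of 0.5
```

Under these assumptions the 39 winning bets out of 101 reproduce the net loss of $2,920 reported in the text.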


V. Conclusion 

Zuber et al. qualified their claim of uncovering a speculative ineffi¬ 
ciency in the market for football betting. The evidence in this note 
amplifies their qualification. Hypothesis tests for the 1983 and 1984 
seasons are uniformly consistent with the efficient markets hypothesis 
and reject the extreme alternative hypothesis of Zuber et al. Further¬ 
more, for the purpose of predicting game outcomes, the net effort 
variables used in their model add no information to that already 
incorporated in the Las Vegas gambling line. The method for exploit- 

8 Simulations using δ = 1.0, 2.0, and 3.0 yielded returns of -$2,510, -$1,600, and
-$1,520, respectively.
TABLE 4

A. Gambling Simulations: Final 8 Weeks of 1984 Season

         Number    Number    Wins/      Sum        Net      Rate of
           of        of       Bets    Gambled    Return     Return
Week      Bets      Wins      (%)       ($)        ($)        (%)
  9        12         5       41.7     1,320      -270       -20.5
 10        14         4       28.6     1,540      -700       -45.5
 11        12         6       50.0     1,320       -60        -4.5
 12        14         5       35.7     1,540      -490       -31.8
 13        11         4       36.4     1,210      -370       -30.6
 14        13         4       30.8     1,430      -590       -41.3
 15        12         5       41.7     1,320      -270       -20.5
 16        13         6       46.2     1,430      -170       -11.9

B. Cumulative Net Dollar Returns

          1983 NFL Season     1984 NFL Season
Week      (Zuber et al.)      (Sauer et al.)
  9             140                -270
 10             390                -970
 11             640              -1,030
 12             470              -1,520
 13              90              -1,890
 15           1,340              -2,750
 16           1,380              -2,920



ing the supposed inefficiency claimed by the authors is shown to incur 
substantial losses when extended out of the sample on which it is 
based. Their paper thus fails to provide sufficient evidence to support 
the argument that speculative inefficiencies exist in the betting mar¬ 
ket for NFL games. 
Appendix

TABLE A1

Unconstrained Regressions for Information Added by the
Zuber et al. Variables: 1983 (Dependent Variable:
Home-Away Score Differential)

Coefficient                 Estimate    t-Statistic

First Regression: Zuber et al. Variables Plus Vegas Line

Intercept                    -2.331        1.165
Rushing yards                 -.087        1.351
Passing yards                  .004         .097
Fumbles                       1.320         .292
Interceptions                 2.904         .854
Penalties                      .254         .298
Pass percentage               -.210        -.629
Wins                          -.491        -.465
Rookies                        .473        1.141
Vegas line                    1.865        3.111
R²                             .324
Sum of squared errors        21,082
Total sum of squares         31,186
Degrees of freedom              102

Second Regression: Vegas Line Only

Vegas line                    1.290        6.962
R²                             .304
Sum of squared errors        22,185
Degrees of freedom              111
Book Review


New Developments in the Analysis of Market Structure. Edited by Joseph E. Stiglitz
and G. Frank Mathewson.

Cambridge, Mass.: MIT Press, 1986. Pp. 672. $13.50. 


This volume contains the proceedings of a conference of the same name that 
was sponsored by the International Economic Association and held in May 
1982. The purpose of the conference was to survey the advances made in
theoretical industrial organization over the previous decade and to identify 
desirable directions for future work. It is instructive to compare the common 
themes that underlie the current papers with those of a previous conference 
sponsored by the National Bureau of Economic Research, whose proceedings 
were published in 1972 (Fuchs 1972). The objectives of the two events were 
the same, but the earlier conference was smaller in scope. There were four 
contributors compared to 17 in the current volume. This undoubtedly re¬ 
flects the increasing interest of economists in the problems of industrial or¬ 
ganization, or at least in the theoretical tools that may prove useful in explor¬ 
ing them. Needless to say, the current collection of papers represents a more 
sophisticated mathematical approach than was typical 10 years before. In¬ 
deed, mathematical economics has in large part shifted focus from general 
equilibrium theory and the analysis of perfectly competitive economies to the 
analysis of strategic behavior and imperfect information. These issues are 
addressed in virtually all the current papers. However, this is by no means a 
volume for technicians. Throughout, the emphasis is on the important eco¬ 
nomic problems of market structure. While in some cases, notably the paper 
by d’Aspremont and Gabszewicz, the tools are of interest in themselves, the 
analysis is rarely more technical than is warranted by the problem of interest. 

Two broad themes are present in the current work. First, how do small
numbers of firms compete with one another? In the 1972 volume this ques¬ 
tion was referred to as the “oligopoly problem,” and by general agreement a 
“theory of oligopoly” did not then exist and was unlikely to emerge. That is, 
there were many theories and no ground rules that could suggest a meaning¬ 
ful choice among them. Few at the time would have predicted the surge of 
interest in strategic behavior through the study of normal form and extensive 
form games or the application of these techniques to interesting problems 
such as limit pricing and entry deterrence. Of course, today there is still no 
“theory of oligopoly,” and the search for one is not high on the agenda of 
most game theorists. However, there is a much richer set of tools—that is, 
equilibrium concepts—and some understanding of when to use them. Many 
of the interesting unresolved questions in game theory, as in oligopoly theory, 
involve games with imperfect information. Much work has been done in this 
area since 1982, and it will be surprising if some of this work is not reflected in
the next roundtable conference on industrial organization. 

A second general theme of the current volume concerns the nature of the 
firm. What do firms do? What are the optimal scale and scope of a firm? How 
do firms innovate or adapt to changing economic conditions? This question 
was very much anticipated in the 1972 conference since Ronald Coase and 
Oliver Williamson were both participants. However, at the time, their mes¬ 
sage was primarily a voice in the wilderness. In contrast, over half of the 17 
papers of the current volume are concerned in some way with these issues. 
Rather than viewing the firm as a profit-maximizing machine in a world of 
unchanging certainty, these papers consider alternatives to profit-maximizing 
behavior, incentives for technical innovation, and the nature of vertical con¬ 
tracts. As in the modeling of competitive behavior, informational issues are at 
the heart of the problem. 

The papers in the volume are divided into seven sections. All but three are 
surveys, representing either a fairly broad overview of the recent literature or 
a synthesis of the authors’ own work.

In part 1, Archibald, Eaton, and Lipsey survey an approach to monopolistic
competition theory in which goods are represented by their address (i.e., 
coordinates) in an underlying characteristics space. The set of potentially 
producible goods is therefore a continuum. In contrast to the Chamberlinian 
model, in which free entry forces profits to zero, positive profit equilibria are 
possible in the address models. Essentially this occurs because entering firms 
must locate among already established firms and in general cannot expect a 
market of equal size. Consequently, strategic behavior by incumbent firms 
and potential entrants is to be expected, and market structure will depend on 
factors such as the degree of product-specific fixed costs, specificity of capital 
to location, and so forth. This paper surveys a broad literature at a rapid but 
intelligible pace. The authors also suggest a number of worthy topics for 
future research. 

Part 2 consists of three papers on forms of competition. Encaoua, Geroski, 
and Jacquemin consider the effects of strategic behavior on market structure
when there is either an exogenously given first-mover advantage or a form of 
imperfect information. In the former case the authors conclude in a survey of 
a number of different models that limit pricing is unlikely to be a credible 
entry-deterring strategy but that strategic investments in capacity, advertis¬ 
ing, location, or similar activities are likely to be credible. When the exoge¬ 
nous first-mover advantage is removed, these activities no longer appear 
credible. However, for markets with an informational asymmetry, the entry 
deterrence may be attained by investment in reputation. Unfortunately, 
many of the results of this paper rely on game-theoretic reasoning that is 
incompletely developed both because the presentation is hurried and some¬ 
what confused and because the field itself is rapidly changing. 

Gilbert considers preemptive competition by firms in which “the timing of 
actions is important." While these results are comparable with those obtained 
in the previous paper, the analysis is more economic than game theoretic. 
Gilbert considers two basic questions. First, is entry deterrence desirable if it is 
feasible? Second, when is deterrence feasible or credible, assuming it is 
desired? Generally speaking, entry deterrence is desirable if there are suffi¬ 
ciently many potential entrants. However, if entrants expect Cournot compe¬ 
tition after entry occurs, deterrence is found to be credible only in the pres¬ 
ence of substantial sunk costs and highly elastic demand. In a dynamic model 
of capacity expansion it was also shown that conditions for credible entry
deterrence are stringent. 

Selten presents a model of organizational slack based on the standard 
Cournot oligopoly model. Each firm maximizes profits on the basis of its 
current marginal costs including the cost of slack. Under a “strong slack 
hypothesis” there is a tendency for slack to increase whenever profits are 
positive. Therefore, profits are eventually driven to zero regardless of the 
number of firms in the market. New entrants starting with zero slack will 
choose to enter if they can be accommodated in the market with nonnegative 
profits in the long run. The primary result of this model is that new entry 
always increases aggregate welfare in a world with slack, whereas entry re¬ 
strictions may increase welfare in the standard model if the welfare improve¬ 
ments due to lower prices are outweighed by higher aggregate fixed costs. In 
the model with slack, entry reduces both prices and marginal costs, and the 
effect is shown to always outweigh the increase in fixed costs. 

Three papers in part 3 address the issues of vertical integration and vertical 
restraints. In the first, Williamson presents a short survey of transactions cost 
economics, with applications to incentives for vertical integration, franchise 
bidding as a substitute for traditional regulation, and nonstandard contract¬ 
ing as a remedy for opportunistic behavior when investment in specific assets 
is efficient. 

Green presents a model of vertical integration based on inflexible prices in
the market for an intermediate good. The net external demand for the inter¬ 
mediate good is assumed to be stochastic, and upstream or downstream firms 
in the industry are assumed to be subject to rationing when aggregate de¬ 
mand is unequal to aggregate supply. Integrated firms have an assured mar¬ 
ket, but integration comes at a cost of reduced efficiency. In this model, 
integration of two firms reduces the expected profits of the remaining
nonintegrated firms. This negative externality tends to make equilibria with both
integrated and nonintegrated firms unstable. Thus totally integrated or to¬ 
tally nonintegrated market structures are more likely to be observed. 

Mathewson and Winter examine the welfare implications of vertical re¬ 
straints such as resale price maintenance, quantity forcing, franchise fees, and 
closed territory distributions. A vertical restraint may serve to enhance profits 
at both levels of production. In such cases the authors determine, in the case 
of specific functional forms, when vertical restraints can also improve aggre¬ 
gate welfare. 

Part 4 consists of two papers that deal with collusion and oligopoly. In the 
first, d’Aspremont and Gabszewicz address the stability of collusion in an 
abstract game. Given a finite number of player types, a syndicate collusive 
scheme describes the set of players of each type who play the game as a 
syndicate, with the remaining players of each type playing as singletons. A 
payoff correspondence, which may be based on the core or any other solution 
concept, assigns a set of outcomes to each collusive scheme. Then a collusive 
scheme is said to be internally stable if no syndicate member prefers to be 
unorganized, and at least one syndicate member is strictly better off. External 
stability is expressed by the comparison of a given collusive scheme with other 
possible schemes in an appropriate class. A strong sense of external stability 
allows a comparison with all possible schemes, while a restricted sense might 
allow comparison only with the scheme representing total disagreement. Ex¬ 
ternal stability is defined either as unanimous stability, in which all possible 
alternatives within the class are allowed, or as individual stability, in which the 
only alternatives are those that result from the move of a single player. 
Two illustrative examples are given. One concerns a single-output, constant-returns
economy with labor and capital as inputs. Here it is shown that a
syndicate collusive scheme cannot be internally stable if neither factor is to¬ 
tally organized. Furthermore, in large economies, no collusive scheme is 
unanimously stable relative to the disagreement outcome. The second ex¬ 
ample consists of a dominant cartel and a competitive fringe of firms, all with 
identical cost functions. In this example, it is straightforward to demonstrate 
that in any collusive structure, unorganized firms do better than cartel mem¬ 
bers; the profit of any firm is increasing in the size of the cartel; and for any 
collusive scheme in which the dominant cartel has two or more members, all 
firms receive higher than competitive profits. Thus, while it is clear that no 
collusive scheme can be internally stable, it does not necessarily pay for a 
cartel member to defect since doing so reduces the profit to the unorganized 
firms. Furthermore, there is at least one scheme that is individually stable in 
the class of all collusive schemes. 

Also in part 4, Salop considers contractual arrangements that sellers might 
use in order to facilitate oligopoly coordination. The setting is an ordinary 
“prisoner’s dilemma” in which two firms compete in prices, either in a single 
period or in a repeated game. One such arrangement is the “most favored 
nation” clause according to which a seller commits to refund to a buyer the 
difference between that buyer’s price and any lower price for comparable 
goods that the seller offers to a different buyer. Another is the “meeting the 
competition” clause in which a seller commits, by contract or in advertising, to 
match any other seller’s lower price. While these agreements are superficially 
“competitive” in appearance, it is clear that they can have the effect of
facilitating collusion by penalizing price cutting and therefore altering the 
payoff matrix in the underlying game. However, the analysis does not explain 
the circumstances under which a firm would choose to include such clauses in 
its contracts, nor does it directly model the strategic behavior of buyers in the
game. 

Two papers in part 5 are concerned with market structure and incentive 
issues in centrally planned economies. In the first, Horvat surveys recent 
models of the worker-managed firm. If worker-managers choose to maximize 
residual income per worker, it is easily seen that output is inversely related to 
price and positively related to the level of fixed costs. However, these perverse 
effects may be mitigated if alternative models of worker incentives are consid¬ 
ered. For example, if workers maintain an aspiration wage that is fixed dur¬ 
ing the planning period, the worker-managed firm behaves just as its profit- 
maximizing counterpart. In the second paper of this section, Roman discusses 
the nature of competition and the implications for market structure in cen¬ 
trally planned economies. He suggests that the high concentration levels typi¬ 
cally observed result from the overestimation of technological scale econo¬ 
mies. Also a factor is the central planner's ability to coordinate a concentrated 
industry more successfully than an unconcentrated one. There is competition 
among firms in centrally planned economies since planners use rank-order 
evaluation of projects in order to allocate resources. It is less clear that there is 
dynamic competition, however, due to the absence of bankruptcy-induced 
exit. 

Part 6 consists of a survey of the theory of contestable markets by Baumol, 
Panzar, and Willig. As is generally well known, a perfectly contestable market 
equilibrium is one in which no potential entrant could anticipate positive 
profits on entry, assuming that incumbent firms do not respond to entry with 
an immediate price cut. This exposition defines all the appropriate concepts 
and presents most of the important results for single-output and multiple-output
markets. One section relates imperfect contestability to the level of
sunk costs in the underlying technology. 

Part 7, the final section, contains five papers on special issues in competition 
and market structure. Schmalensee surveys three aspects of the role of adver¬ 
tising in industry: advertising as nonprice competition, advertising and effi¬ 
ciency, and advertising and market structure. Among the questions receiving 
the most attention are the following: Do markets with competition through
advertising have higher profits in equilibrium than markets with price compe¬ 
tition? Can advertising that is purely informative reduce efficiency? Can 
purely persuasive advertising increase efficiency? Can advertising be used to 
create a barrier to entry? 

The remaining papers deal in one way or another with innovation and 
technological change. Stiglitz emphasizes the relationship between market 
structure and the reward structures for innovative activity. The paper criti¬ 
cizes traditional analysis that is partial equilibrium and that ignores the risks 
associated with innovation. In a general equilibrium model in which the gov¬ 
ernment cannot break up a monopoly but can use various policies to subsidize 
R & D activities, it is shown that the monopoly level of research may be either
greater than or less than optimal. In a model in which the results of research
are uncertain and there is competition in R & D as well as in the product
market, competition might reduce the level of innovation by increasing risk 
or leading to excessive correlation of research activity. Competitive markets 
seem to perform better when account is taken of managerial incentives. It is 
known that rank-order evaluation schemes have desirable incentive proper¬ 
ties, and this is exactly the way in which a competitive market with a patent 
system rewards research activity. 

Nelson surveys work with Winter on evolutionary models of economic 
change. Evolutionary models are explicitly dynamic in character. In each 
period firms are posited to have fixed decision rules. Firms also search for 
new rules that can be used in the next period. In response to an external 
shock, firms can adapt both their current behavior under fixed rules and their 
search behavior. As a result of search, successful firms will gain in market 
share. This form of modeling generally relies on computer simulation rather 
than analytic solution. Nelson describes how evolutionary models can be ap¬ 
plied to a one-time shock, to long-run growth with technological change, and 
to the dynamics of Schumpeterian competition. 

Spence considers the relationship between cost-reducing R & D and market 
structure. The approach is related to that of Stiglitz, but the viewpoint is 
normative and issues of risk and managerial incentives are not addressed. 
The analysis focuses on scale economies in R & D and spillover effects that 
describe the extent to which other firms benefit from a given firm’s research 
activity. Spillover effects tend to decrease an individual firm’s R & D activity,
but they also tend to reduce the aggregate cost of achieving a given level of 
industry cost reduction since, once attained, knowledge can be transmitted at 
nearly zero marginal cost. Thus Spence argues that, with appropriate sub¬ 
sidies, market performance improves as spillovers increase. 

In the final paper of the volume, Dasgupta discusses several models in 
which firms compete using R & D activity as a strategic variable. In one model,
marginal (and average) production costs are directly related to R & D expen¬ 
ditures, which are assumed to be subject to diminishing returns. In this 
model, R & D is treated exactly like an investment in fixed capital that lowers 
unit production costs. The purpose is to compare the Nash equilibrium outcomes
with a set of stylized facts regarding technological competition. A
second class of models considers the issue of preemptive patenting when one 
firm has a first-mover advantage associated with incumbency. In this form of 
“winner-take-all” model, the incumbent and a set of rivals simultaneously 
choose R & D levels and in effect bid for a successful patent. In the Stackelberg
equilibrium the incumbent chooses preemption and the rivals choose
not to compete. Similar results are obtained if the incumbent must compete 
for multiple patents. 

If one is to criticize this work it must be for its omissions. For example, 
market structure in regulated industries is considered only tangentially, and 
the important question of entry by regulated firms into competitive markets, 
or competitive firms into regulated markets, is not addressed at all. Although
the papers are not overly technical in themselves, the selection of topics may 
have been overly influenced by purely technical advances in game theory and 
other fields. None of the speakers used the occasion, as Coase did in 1972, to 
speculate on the important problems that theory has so far overlooked. These 
are, however, minor points. Viewed as a whole, this is an excellent book. 
Although it has some of the defects that are inevitable in a conference vol¬ 
ume, the overall quality of the papers is very high and the papers fit together 
well as a coherent whole. The introduction by Stiglitz provides a useful over¬ 
view of recent research related to work in the volume. Also useful are the 
transcripts of the discussion following each paper. The conference was un¬ 
questionably a success in its attempt to survey past work and identify inter¬ 
esting research questions in industrial organization. This volume is a valuable 
collection of papers that will prove useful to students and scholars at least 
until it is replaced by the proceedings of a future conference that is similar in 
scope. 

William W. Sharkey

Bell Communications Research 
Money and the Stock Market


Milton Friedman 

Hoover Institution 


Quarterly data for the period from 1961 to 1986 suggest that the 
real quantity of money (defined as M2) demanded relative to income 
is positively related to the deflated price of equities (Standard and 
Poor’s composite) three quarters earlier and negatively related to the 
contemporaneous real stock price. The positive relation appears to 
reflect a wealth effect; the negative, a substitution effect. The wealth 
effect appears stronger than the substitution effect. The volume of 
transactions has an appreciable effect on M1 velocity but not on M2 
velocity. Annual data for a century suggest that the apparent domi¬ 
nance of the wealth effect is the exception, not the rule. 


I. Introduction 

This note had its origin in a chart covering the past quarter century 
prepared by a financial institution that showed a close inverse relation 
between the level of the Dow Jones stock market index and the veloc¬ 
ity of the monetary aggregate now designated M2 by the Federal 
Reserve System. 

In the extensive work on the demand for money that I and others 
have done, the role of the stock market in affecting velocity has been 
taken into account in either of two ways: first, by treating the volume 
of financial transactions engendered by the market as an argument in 
the demand function on the grounds that such transactions would 
“absorb” money, hence reducing income velocity; 1 second, by taking 

I am indebted for comments on earlier drafts to Albert J. Field, Robert Hetzel, David 
Laidler, Allan H. Meltzer, Anna J. Schwartz, and two anonymous referees.

1 This has been a recurring theme ever since Irving Fisher’s (1911) early emphasis on 
the transactions approach to the quantity theory, and especially during stock market 
booms, from 1929 to the present. For a recent example, see “M1 Revisited” (1986) and
the earnings or dividend yield on securities as one of the returns on
an alternative to money in a portfolio (Hamburger 1966, 1977, 1983).
To oversimplify, the result generally has been a finding that the direc¬ 
tion of effect is indeed as suggested by theory but that the magnitude 
of effect is small. 2 In addition, there has also been considerable inves¬ 
tigation of the reverse direction of influence, of the effect of changes 
in the quantity of money on stock market prices (see, e.g., Keran
1971; Sprinkel and Genetski 1977, esp. pp. 120-39). 

I know, however, of no econometric attempt to relate the level of
stock prices to the demand for money, except indirectly, since the 
value of equity stocks is included in the total of nonhuman wealth, a 
variable that both theory and economic evidence suggest is related to 
the quantity of money demanded. 

The inverse relation between stock prices and monetary velocity (or 
direct relation between stock prices and the level of real cash balances 
per unit of income) can be rationalized in three different ways: (1) A 
rise in stock prices means an increase in nominal wealth and gener- 


Wenninger and Radecki (1986). For an analysis of the 1920s, see Field (1984). For an
analysis of post-World War II data, see Cramer (1981, 1986). In demand studies by
Anna Schwartz and me, we have omitted transactions variables for the reason indicated
by the following quotation: 

One variable that has traditionally been singled out in considering the de¬ 
mand for money on the part of business enterprises is the volume of transac¬
tions ... per dollar of final products; and, of course, emphasis on transactions 
has been carried over to the ultimate wealth-owning unit as well as to the 
business enterprise. The idea that renders this approach attractive is that there
is a mechanical link between a dollar of payments per unit time and the 
average stock of money required to effect it--a fixed technical coefficient of
production, as it were. It is clear that this mechanical approach is very differ¬ 
ent in spirit from the one we have been following. On our approach, the 
average amount of money held per dollar of transactions is itself to be re¬ 
garded as a resultant of an economic equilibrating process, not as a physical 
datum. If, for whatever reason, it becomes more expensive to hold money, 
then it is worth devoting resources to effecting money transactions in less 
expensive ways or to reducing the volume of transactions per dollar of final
output. In consequence, our ultimate demand function for money in its most 
general form does not contain as a variable the volume of transactions or of 
transactions per dollar of final output; it contains rather those more basic 
technical and cost conditions that affect the costs of conserving money, be it by 
changing the average amount of money held per dollar of transactions per 
unit time or by changing the number of dollars of transactions per dollar of 
final output. This does not, of course, exclude the possibility that, for a partic¬ 
ular problem, it may be useful to regard the transactions variables as given and 
not to dig beneath them and so include the volume of transactions per dollar 
of final output as an explicit variable in a special variant of the demand func¬ 
tion. [Friedman 1956, pp. 12-13] 

2 The major exception is Field (1984), who concludes, on the basis of dynamic simulations,
that “absent the post-1925 surge in asset exchanges . . . holdings of M1 would
have been on average 17 percent below their actual levels” (p. 50). 
ally, given the wider fluctuation in stock prices than in income, also in
the ratio of wealth to income. The higher wealth to income ratio can 
be expected to be reflected in a higher money to income ratio or a 
lower velocity. (2) A rise in stock prices reflects an increase in the 
expected return from risky assets relative to safe assets. Such a change 
in relative valuation need not be accompanied by a lower degree of 
risk aversion or a greater risk preference. The resulting increase in 
risk could be offset by increasing the weight of relatively safe assets in 
an aggregate portfolio, for example, by reducing the weight of long¬ 
term bonds and increasing the weight of short-term fixed-income 
securities plus money. (3) A rise in stock prices may be taken to imply 
a rise in the dollar volume of financial transactions, increasing the 
quantity of money demanded to facilitate transactions. Offsetting 
these factors is (4) a substitution effect. The higher the real stock 
price, the more attractive are equities as a component of the portfolio. 
The relative strength of the inverse effect of items 1-3 and the posi¬
tive effect of item 4 is an empirical question. 3 

The graph that stimulated these speculations related the nominal 
level of stock prices to the velocity of M2 (calculated as a ratio of 
personal income to M2 in order to get monthly observations). Yet the 
first rationalization—a wealth effect—suggested in the preceding 
paragraph clearly requires that the stock price be measured in real 
rather than nominal terms. Simple correlations using quarterly data 
for 1961:1-1986:4 between M2 velocity on the one hand and real and 
nominal stock prices on the other were consistent with this theoretical 
expectation since they yielded a higher correlation for real than for 
nominal stock prices. In addition, the highest absolute value of the 
correlation was attained when the real stock price was correlated with 


3 An anonymous referee wrote: "There is a fourth story .... as reasonable as the
three offered .... Stock prices respond to changes in anticipations about future real 
activity. The market responds first because it is a rational information processor, and 
because the price response is close to costless. .. . Debt expands or contracts .... and 
the debt changes initially show up as changes in broad measures of 'money,' like M2. 
Because it is subject to larger adjustment costs, real activity is the last to move." I believe 
that this fourth story is not as reasonable as the three offered. It requires autonomous 
and predictable movements in real income, with the quantity of money responding 
passively to such movements. In a regime in which the monetary authority operates by 
manipulating an interest rate, there no doubt is some such feedback effect of changes 
in real income on the quantity of money. However, I believe that the bulk of the 
evidence contradicts a purely real theory of business fluctuations and supports the view 
that the predominant short-term relation between money and real income is from 
changes in the stock of money to changes in real income rather than from real income 
to (earlier) stock of money. Nonetheless, it could be worth having a more direct test of 
this hypothesis. However, the tests that have occurred to me would require a body of 
data different from those used in this paper and a research effort at least as great as 
that underlying this paper. Hence, I leave that task to others. 
velocity three quarters later. 4 These encouraging results led me to
explore further. 

To incorporate the second rationalization—a rise in relative re¬ 
turns from risky assets—requires some measure of the relative attrac¬ 
tiveness to portfolio holders of moderately risky versus less risky as¬ 
sets. As a simple measure, I took the ratio of the yield on long-term to 
the yield on short-term nominal securities. 5 A rise in this ratio reflects 
a shift in demand from long to short securities and hence toward less 
risky nominal assets. Simple correlations showed the log of the ratio to 
be correlated negatively, as implied by the hypothesis, on a synchro¬ 
nous basis with both the logarithm of the real stock price and the 
logarithm of M2 velocity. However, the absolute correlations were 
very low (around .25-.30) and hence somewhat ambiguous. 

Adequate data for testing the third rationalization—the effect of 
the volume of transactions—are available for a shorter period than 
the remaining data: only from 1970 on. It turns out that the volume 
of transactions has no significant effect on M2 velocity but does on M1 
velocity (see App. B). 

One final introductory point: Preliminary investigations of M1 ve¬ 
locity and the velocity of the monetary base gave wholly negative 
results, which explains the concentration on M2 velocity. These re¬ 
sults are consistent with other evidence suggesting that M1, as currently
defined, is a less satisfactory aggregate for judging short-term
changes in monetary holdings than either M2, as currently defined,
or the base (see Friedman 1986b).


4 The variables used were nominal stock price, Standard and Poor’s composite; real
stock price, nominal stock price divided by GNP deflator; and velocity, nominal GNP
divided by M2 two quarters earlier as estimated and seasonally adjusted by the Federal 
Reserve but adjusted also from 1983:1 on to eliminate the effect of the introduction of 
money market mutual deposit accounts (MMDAs) (see App. A). Leading velocity was 
used instead of actual velocity to allow for the tendency of changes in money to precede 
changes in nominal income. A lead of two quarters was used on the basis of earlier 
studies of the length of the mean lead of money. However, in the course of the analysis 
for this article, it turned out that, for the past quarter century alone, a lead of three 
quarters gives somewhat better results. The correlations were calculated from both 
actual values and the logarithms of the values with essentially the same results. 
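
As an illustration only, the variable definitions in this note can be sketched in a few lines of Python. The function names and the sample figures are hypothetical and are not taken from the article; only the definitions (real stock price as nominal price over the deflator, and leading velocity as nominal GNP over M2 two quarters earlier) follow the text.

```python
import math

def real_stock_price(nominal_price, deflator):
    """Deflate a nominal stock index by a price deflator, quarter by quarter."""
    return [p / d for p, d in zip(nominal_price, deflator)]

def leading_velocity(nominal_gnp, m2, lead=2):
    """Nominal GNP in quarter t divided by M2 in quarter t - lead.

    The first `lead` quarters have no earlier M2 observation, so they
    are returned as None rather than guessed.
    """
    out = []
    for t, y in enumerate(nominal_gnp):
        out.append(y / m2[t - lead] if t >= lead else None)
    return out

# Hypothetical quarterly figures, purely for illustration.
gnp = [100.0, 102.0, 104.0, 106.0, 108.0]
m2 = [50.0, 51.0, 52.0, 53.0, 54.0]

v = leading_velocity(gnp, m2, lead=2)
log_v = [math.log(x) if x is not None else None for x in v]
```

Changing `lead` to 3 gives the alternative timing that the note reports as working somewhat better for the past quarter century.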

5 The yields used for the period from 1955:1 on were long-term yield, the yield on
20-year Treasury bonds, and short-term yield, the yield in the secondary market on
3-month outstanding Treasury bills (both yields were from the Board of Governors of the
Federal Reserve System). I experimented also with the ratio of the yield on corporate
bonds (Moody's all industries) to the yield on 3-month banker’s acceptances. However,
it gave poorer results, so I abandoned it, except for the period prior to 1955, to avoid 
the effects and aftereffects of the Federal Reserve policy of pegging the price of 
government securities. 
II. M2 Velocity: Quarterly Data

In our earlier work, Anna Schwartz and I concluded that the major
variables affecting the quantity of money demanded are (1) real per
capita income; (2) the difference between the yield on other nominal
assets and on money, which we approximated by the product of a
short-term rate and the ratio of high-powered money to the money
stock; (3) the nominal yield on real assets, which we proxied by the
rate of change of nominal aggregate income; and (4) some dummy
variables that are irrelevant for the post-World War II period, as well
as the change in the financial sophistication of the United States prior
to World War I, also irrelevant for the present purpose (see Friedman
and Schwartz 1982, chap. 6, esp. pp. 259-86).

With respect to real per capita income, the use of velocity as a 
dependent variable along with the exclusion of real income as an 
independent variable is equivalent to treating the elasticity of demand
for real balances with respect to real per capita income as unity, not
far from the 1.14 that we had estimated for the century prior to 1975.
Additional checks supported the conclusion that no violence would be 
done to the postwar relations by using a unit elasticity. 6 
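
The equivalence can be sketched in logarithms. The symbols below are introduced here for illustration and are not the article's notation: m is real per capita money balances, y is real per capita income, X stands for the other determinants, and beta is the income elasticity. The sketch also ignores the two-quarter lead used in measuring velocity.

```latex
\begin{align*}
\log m &= \alpha + \beta \log y + \gamma' X, \\
\log V &\equiv \log y - \log m
        = -\alpha + (1 - \beta)\log y - \gamma' X .
\end{align*}
```

Regressing log V on X alone, with log y left out, forces the coefficient (1 - beta) to zero, which is the same as setting beta equal to one.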

For items 2 and 3, the differential yield on money and the proxy 
return on real assets, the variables that have generally been used in 
monetary demand studies based on quarterly data have been a yield 
on short-term nominal assets as a measure of the yield on an alternative
to money and the rate of change of prices as a measure of the
nominal yield on real assets. After experimenting with these simpler
variables, I found that replacing them by the variables we had used in
our Trends study gave decidedly better results: higher R²'s and higher
t-values for the coefficients. Hence, I included R_N, the differential
yield on money, and g_Y, the rate of change of nominal income, to use
our earlier notation, as independent variables. 7


6 Our earlier estimate was based on averages over half-cycles. The corresponding 
estimate for the underlying annual data (see Sec. III below) for 1885-1985 is almost
identical (1.16), for 1951-85 it is 1.03, and for 1961-85, 0.88. For quarterly data for
1961:1-1986:4, a multiple regression of the logarithm of nominal GNP on the logarithm
of M2 for the current and three prior quarters yields a sum of the coefficients of
M2 terms of 1.03. Interestingly, the coefficients of the M2 terms for the current and
two prior quarters sum to close to zero, and the coefficient for the third prior quarter is
1.01. Including other variables in this regression yields a sum of coefficients of the
monetary terms equal to 0.996. The difference between the coefficient and unity is not
statistically significant, at any reasonable significance level, for any of the
postwar regressions. 

7 For R_N, I used the product of the yield in the secondary market on 3-month
outstanding Treasury bills (RTBill) (prior to 1955, rate on 3-month banker's acceptances),
expressed as a decimal (i.e., 2 percent = 0.02), multiplied by the ratio to M2 of
One final point before I present some results. In our earlier work
(1982), Schwartz and I had concluded that a distinct change had 
occurred in the relationship between interest rates and the rate of 
price change around the mid-1960s. Prior to that time, there had 
been little relationship between the two. From the early sixties on, that 
relationship started to get closer and closer and after about 1965 
became extremely close indeed in line with the effect that Fisher 
(1896, 1907) had suggested many decades earlier. It appeared likely 
that there would be a similar break in the relationship between mone¬ 
tary variables in general and stock prices. That possibility was rein¬ 
forced by an examination of the relationship between velocity and 
stock prices. As already noted, for the period from the mid-1960s on, 
real stock prices seemed to lead the velocity of money with a max¬ 
imum simple correlation of -.7 at a lead of three quarters. On the
other hand, for the period prior to that date, the timing relation is 
reversed, with velocity leading real stock prices (maximum simple 
correlation equals -.7 at a lead of four quarters). Accordingly, I 
chose the period from 1965:1 to 1986:4, the latest quarter for which I 
had data, on which to concentrate, later expanding it to 1961:1-1986:4.
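
The search for the lead that gives the largest simple correlation can be sketched as follows. This is a minimal illustration, not the procedure actually used for the article: the function names are hypothetical, and the two short series are made up so that the second is the negative of the first three periods earlier.

```python
import math

def pearson(x, y):
    """Simple (Pearson) correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lead_correlation(leader, follower, k):
    """Correlate leader[t] with follower[t + k] for a lead of k >= 0 periods."""
    if k == 0:
        return pearson(leader, follower)
    return pearson(leader[:-k], follower[k:])

def best_lead(leader, follower, max_lead):
    """Return (k, r) for the lead with the largest absolute correlation."""
    results = [(k, lead_correlation(leader, follower, k))
               for k in range(max_lead + 1)]
    return max(results, key=lambda kr: abs(kr[1]))

# Made-up series: follower is minus the leader, delayed three periods.
leader = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0, 3.0, 9.0, 6.0, 10.0, 2.0, 5.0]
follower = [0.0, 0.0, 0.0] + [-x for x in leader[:-3]]

k, r = best_lead(leader, follower, max_lead=6)
```

With real stock prices as `leader` and velocity as `follower`, a negative `r` at `k = 3` would correspond to the pattern described for the period from the mid-1960s on; swapping the two arguments checks the reverse timing reported for the earlier period.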

Table 1 presents the final result of experimentation with different 
definitions of variables and different timing relations and documents 
the statistically significant difference between the 1950s and the later 
period. Figure 1 plots, for the whole period from 1947:1 to 1986:4, 


high-powered money (HPM), i.e., the monetary base, as estimated by the Federal 
Reserve without adjustment for reserve requirement changes, but with the adjustment 
of M2 mentioned earlier to allow for the introduction of MMDAs. For the period since 
the fourth quarter of 1981, the Federal Reserve has not seasonally adjusted the base 
without adjustment for reserve requirements. Accordingly, I use throughout the ratio 
of the nonseasonally adjusted HPM to nonseasonally adjusted M2. For the period
before 1959, I did not have quarterly data on HPM, so I interpolated HPM from
annual data. For g_Y, I used a four-quarter difference between the logarithms of nomi¬
nal GNP, i.e., the logarithm of nominal GNP in one quarter minus the corresponding 
logarithm of nominal GNP four quarters earlier. The use of R_N raises a problem of
possible spurious correlation because the money multiplier (M2/HPM) enters into both
M2 velocity (M2/GNP = [M2/HPM] x [HPM/GNP]) and R_N (R_N = RTBill x [HPM/M2]).
This spurious correlation is mitigated by three factors: (1) The money multiplier
enters velocity for a period two quarters earlier than it enters R_N, which was found to
give best results on a synchronous basis. However, the effect of using a lagged value is 
minor since the serial correlation of the money multiplier with a lag of two quarters is 
.9925. A more important point is that the money multiplier enters primarily as a trend
effect, having little effect on quarter-to-quarter movements. (2) The dependent variable
is the logarithm of velocity; R_N is the original value, not its logarithm. (3) The
variance of R_N is dominated by the variance of RTBill. In a decomposition of the
variance of log R_N, the variance of log RTBill is .207621, that of log(HPM/M2) is
.0260255, and the cross-correlation term is -.116284. A more detailed analysis of this
statistical problem is contained in Friedman and Schwartz (1982, p. 270, n. 46), where it 
is concluded that the spurious statistical effects are not serious for our phase-average 
relations. 
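
The variance decomposition mentioned in this note can be illustrated with a short sketch. It rests on an assumption not stated in the text: that the "cross-correlation term" is twice the covariance of log RTBill and log(HPM/M2), so that the three reported figures add up to the variance of log R_N. The function names and the small sample series are hypothetical.

```python
def variance(x):
    """Population variance of a list."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def covariance(x, y):
    """Population covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def decompose(log_a, log_b):
    """Split var(log_a + log_b) into var(log_a), var(log_b), and 2*cov.

    Returns (var_a, var_b, cross, total); var_a + var_b + cross equals total.
    """
    var_a, var_b = variance(log_a), variance(log_b)
    cross = 2.0 * covariance(log_a, log_b)
    total = variance([a + b for a, b in zip(log_a, log_b)])
    return var_a, var_b, cross, total

# Hypothetical log series, only to check the identity numerically.
log_a = [0.1, 0.4, 0.2, 0.5]
log_b = [0.3, 0.1, 0.25, 0.05]
va, vb, cross, total = decompose(log_a, log_b)

# Figures reported in the note, summed under the assumption above.
reported_total = 0.207621 + 0.0260255 + (-0.116284)
```

Under that reading, the reported figures imply a variance of log R_N of about .117, with the variance of log RTBill roughly eight times that of log(HPM/M2).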



Regression of Logarithm of Leading Velocity of M2 on Other Variables for 1951:1-1986:4 and Two Subperiods

(No Correction for Serial Correlation) 
the observed logarithm of the velocity of M2 and the values predicted
by the equation for 1961:1-1986:4 in table 1. It provides perhaps the 
best bird’s-eye summary of the overall results. 

One striking feature is the initial rise in observed velocity until it 
reaches the level predicted from the regression based on much later 
data. The two lines cross between the second and third quarters of 
1951, which, by no coincidence, is two quarters after the famous 
accord was reached between the Federal Reserve and the Treasury 
ending the Federal Reserve’s pegging of interest rates. 8 Prior to that 
accord, rates on short-term highly liquid assets had been kept 
artificially low and stable by Federal Reserve policy, greatly reducing 
their attractiveness as an alternative to holding cash balances. From 
then on, there was something much closer to a free credit market, 
though of course the Fed intervened frequently. The timing of the 
accord explains why I start the detailed regression analysis with data 
for that quarter. 

A second notable feature of figure 1 is the consistency of the rela¬ 
tionship between the observed and predicted velocities throughout 
the period as a whole, even for the period prior to that for which the 
regression was calculated. As noted, there is a statistically significant 
difference between the two periods, yet they are sufficiently similar
that from 1951 to 1961 observed velocity moves up and down around 
predicted velocity and displays occasional parallelism. The major dis¬ 
crepancies are (1) decidedly higher actual than predicted velocity in 
1951, 1952, and 1953, almost surely attributable to the effect of the 
Korean War on anticipations of inflation; 9 (2) a reaction during the 
next 2 years; and, then again, (3) decidedly higher actual than pre¬ 
dicted velocity in 1955, 1956, and 1957, a discrepancy for which I 
have no ready explanation. 

A third striking feature is that there is no period in which there is a 
major break in the relationship. In particular, despite all the talk 
about how the relation between money and other variables has shifted 
drastically in recent years, there is no sign of that in the figure. That 
remark should be qualified, however, in one respect. The values for 
M2 used in these calculations do make an adjustment, as described in 
Appendix A, for the introduction of MMDA’s in 1983. 


8 “Two quarters after” because the denominator for leading velocity for 1951:3 is M2
for 1951:1. 
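
The timing in this note can be made concrete with a short sketch. It assumes, as the note states, that leading velocity for a quarter divides that quarter's GNP by M2 two quarters earlier; the function name and the numbers are hypothetical.

```python
import math

def log_leading_velocity(nominal_gnp, m2, lead=2):
    """Log of GNP in quarter t divided by M2 in quarter t - lead."""
    if len(nominal_gnp) != len(m2):
        raise ValueError("series must cover the same quarters")
    return [math.log(nominal_gnp[t] / m2[t - lead])
            for t in range(lead, len(m2))]

# Hypothetical values for 1951:1, 1951:2, 1951:3.
gnp = [300.0, 310.0, 320.0]
m2 = [150.0, 155.0, 160.0]
velocity = log_leading_velocity(gnp, m2)  # one value: GNP of 1951:3 over M2 of 1951:1
print(velocity)
```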

9 The Korean War period is the only major inflationary episode that I know of that 
was not preceded by more rapid monetary growth and hence can be regarded as the 
result, initially at least, of an autonomous increase in velocity. 

A. Wealth and Substitution Effects of the Real Stock
Price

To return to table 1, the key result, for our purposes, is that the 
coefficient of the logarithm of real stock prices is negative in all the 
equations and statistically significant for both the period as a whole 
and 1961-86. That was equally true in the many minor variations of 
these equations that I calculated in the course of settling on the final
variables and timing relations to be used. This is the sign implied by 
the three rationalizations of an inverse relationship offered earlier: a 
wealth effect, a risk-spreading effect, and a transactions effect. How¬ 
ever, the situation is different for the long-short yield ratio. The risk¬ 
spreading relationship implies that this variable should have a nega¬ 
tive coefficient. Indeed, in calculations in which the variable is entered 
with a three-quarter lead, the coefficient is generally negative. How¬ 
ever, it either is not statistically significant or is on the margin of 
significance, and it consistently performs better if entered synchro¬ 
nously. However, on a synchronous basis, the coefficient is positive. 
This result is puzzling. A rise in the yield ratio implies that short-term 
interest rates are expected to rise, which might be a reason for a later 
reduction in the ratio of cash balances to income, that is, a rise in 
velocity, but not for the current effect that the positive coefficient 
implies. 10 Similarly, a rise in the yield ratio is a reflection of a fear of 
future inflation, but any such effect should be allowed for by our 
proxy for the nominal yield on physical assets. 

The low values of the Durbin-Watson statistic for the three equa¬ 
tions in table 1 led two referees of an earlier version of this paper to 
question whether the results would remain valid if allowance were 
made for the indicated strong serial correlation of residuals. Such 
serial correlation does not bias the estimated coefficients but does 
suggest that the effective number of degrees of freedom is less than 
what a mere count of observations would indicate and hence may bias 
the t-values, altering the apparent statistical significance of the re¬ 
sults. 11 To check this possibility, table 2 compares the table 1 equation 


10 Note that the “reduction” referred to is not in nominal monetary balances. They
refer to the second prior quarter and are thus predetermined. However, current 
spending is not predetermined, so what is implied by the correlation is that a rise in the 
yield ratio encourages holders of cash balances to increase spending by a greater 
amount than they would otherwise regard as appropriate, given their prior cash bal¬ 
ances. 

11 I have mixed reactions to the current widespread tendency to regard serial correla¬ 
tion of residuals as a pure nuisance, if not the original sin, in analyzing time series. 
Serial correlation of residuals does have the effects indicated in the text and, hence, 
does deserve attention. However, some of the means used to attain serially uncor¬ 
related residuals may lead analysts to throw out the baby with the bath. It is often useful
to regard time series as a combination of transitory stochastic and more permanent 

for 1961-86 with two others designed to reduce serial correlation of
residuals: one that adds the lagged value of the dependent variable, 
the other that uses the Cochrane-Orcutt correction for first-order 
serial correlation. The first device reduces the serial correlation ap¬ 
preciably but does not eliminate it; the second goes much farther in 
that direction. For our purposes, however, the main result is that 
neither alternative regression is inconsistent with the basic conclu¬ 
sions suggested by the initial regression. While the coefficients differ 
somewhat in size among the three regressions, they are all in the same 
ballpark and, with one exception (the coefficient of the long-short 
ratio in the regression containing the lagged dependent variable), all 
retain significance. 12 
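
For readers unfamiliar with the second device, the following is a minimal sketch of an iterative Cochrane-Orcutt procedure for a regression with a single explanatory variable. It is not the computation behind table 2: the function names, the simulated data, and the parameter values are all hypothetical, and the sketch assumes first-order serial correlation of the errors.

```python
import random

def ols(x, y):
    """Intercept and slope of a least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def cochrane_orcutt(x, y, tol=1e-8, max_iter=100):
    """Alternate between estimating rho from residuals and OLS on quasi-differences."""
    a, b = ols(x, y)
    rho = 0.0
    for _ in range(max_iter):
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
        num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
        den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
        new_rho = num / den
        # Quasi-difference both series: v(t) - rho * v(t - 1).
        xs = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        a_star, b = ols(xs, ys)
        a = a_star / (1.0 - new_rho)  # recover the intercept of the original equation
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return a, b, rho

# Simulated data with AR(1) errors (hypothetical parameters).
random.seed(0)
n, true_a, true_b, true_rho = 2000, 1.0, 2.0, 0.8
x = [random.gauss(0, 1) for _ in range(n)]
err = [0.0] * n
for t in range(1, n):
    err[t] = true_rho * err[t - 1] + random.gauss(0, 1)
y = [true_a + true_b * xi + ei for xi, ei in zip(x, err)]

a_hat, b_hat, rho_hat = cochrane_orcutt(x, y)
print(round(b_hat, 2), round(rho_hat, 2))
```

The design choice to note is that the transformed regression drops the first observation and estimates the relation among quasi-differenced series, which is why the correction can remove part of any slowly moving component of the data.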

If the wealth effect and the substitution effect operate with the 
same reaction speed, there is no way to isolate their separate effects. 
However, it is plausible that the substitution effect operates more 
rapidly than the wealth effect. To test this possibility, I expanded the 
multiple regression for 1961:1-1986:4 in table 1 by including the real 
stock price with a zero lead. The two equations are compared in table 
3, which also contains additional regressions to check the effect of 
serial correlation of residuals. 

The inclusion of the synchronous real stock price raises the correla¬ 
tion and reduces the standard error, though only modestly, for both 
equations B and C. More important, the constant aside, every common
coefficient and its t-value in these equations is increased in absolute
value, and the coefficient of the synchronous real stock price


underlying components and to regard the two components as reflecting two different 
sets of forces; e.g., purely random measurement errors may have a far larger impact on
the transitory component than on the permanent component. The process of obtaining 
serially uncorrelated residuals may in effect simply eliminate the permanent compo¬ 
nents, leaving the analyst to study the relation among the stochastic components of his 
series, which may be pure noise, when what is of economic interest is the relation 
between the permanent components he has discarded in the process of seeking to 
satisfy mechanical statistical tests. For this reason, plus the problem of interpreting 
statistical tests of significance for a regression that has been chosen from among many 
trials because it yields the “best” result, I have long been skeptical of placing major
emphasis on purely statistical tests, whether t-values, Durbin-Watson statistics, or any
others. They are no doubt useful in guiding research, but they cannot be the major
basis for judging the economic significance or reliability of the results and cannot be a
substitute for a thorough examination of the quality of the data used. Personally, I 
prefer to put more emphasis on the consilience of evidence from a number of different 
sources or periods—as, in this paper, the annual as well as quarterly data. And I would 
regard the testing of my conclusions by examining, for example, similar data for other 
countries as more rewarding than a more intensive statistical mining of the quarterly 
series I have used. 

12 Cochrane-Orcutt corrections were also applied to the two other regressions in table
1, with similar results, and to the regression in table 2 that includes the lagged depen¬ 
dent variable, with the result of rendering the coefficient of the lagged dependent value 
not significantly different from zero. 

final equation, combining data for the United States and the United
Kingdom for the century 1870-1970, yielded a coefficient of -9.3
for R_N and -0.47 for g_Y, with the qualification that these were single-valued
estimates within fairly broad limits (Friedman and Schwartz
1982, pp. 284-85). That equation was for the quantity of money
demanded, whereas the equations dealt with here are for velocity. A
rise in the quantity of money demanded means a decline in velocity.
Hence, in comparisons of these coefficients with those in tables 1, 2,
and 3, the signs should be reversed. If we do so, the coefficients are
remarkably close for R_N for the 1961-86 equations that do not use the
Cochrane-Orcutt correction for serial correlation. They range from
9.9 to 12.8, compared with the longer-run estimate of 9.3. 14 The
Cochrane-Orcutt correction yields lower coefficients, ranging from
5.7 to 6.6, presumably because the effect of the correction is to eliminate
at least part of the longer-term effect of a change in R_N. For g_Y
the situation is different. The coefficient is decidedly smaller in mag¬ 
nitude though of the correct sign, ranging from 0.14 to 0.38, com¬ 
pared with the longer-term estimate of 0.47. An obvious explanation 
is that the year-to-year percentage change in quarterly nominal in¬ 
come is a less accurate proxy for the nominal yield on physical assets 
for quarterly data than the phase-average rate of change of nominal 
income is for phase-average data. The effect of the greater “noise” in
this proxy would be to lower the numerical size of the coefficient so 
that this result cannot be regarded as inconsistent with our earlier 
results. 
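
The direction of this effect can be illustrated under the textbook errors-in-variables assumption, which is supplied here for illustration and is not spelled out in the text: if the proxy equals the true variable plus independent noise, the least-squares coefficient tends toward the true coefficient multiplied by the share of the proxy's variance that is signal. The function name and the variance figures below are hypothetical.

```python
def attenuated_coefficient(beta, signal_var, noise_var):
    """Limit of the OLS slope when the regressor carries independent noise:
    beta * signal_var / (signal_var + noise_var)."""
    return beta * signal_var / (signal_var + noise_var)

# Hypothetical case: a true coefficient of 0.47, with noise as large as the signal.
halved = attenuated_coefficient(0.47, 1.0, 1.0)
unchanged = attenuated_coefficient(0.47, 1.0, 0.0)
print(halved, unchanged)
```

In this stylized case the sign is preserved and only the magnitude shrinks, which is the pattern described above for g_Y.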

III. Tests from Annual Data 

The similarity of these results from post-World War II quarterly data 
with our results for the longer period suggests using the annual data 
as an additional source of information on the effect of the real stock 
price. The data used in Monetary Trends ended in 1975. In connection 
with work that I have been doing on the cyclical pattern of money 
demand, I have extended the data forward to 1985 and calculated an 
identical regression using annual rather than phase-average data. 15 

The dependent variable in the demand function for money was the 
logarithm of real money balances per capita (log m); the independent 
variables were the logarithm of real income per capita (log y), the 


14 For this comparison, I have used the long-run coefficients as estimated in the 
equations with lagged dependent variables. 

15 When identical data were not available for the later period, I extrapolated the
earlier data using the correlation between later and earlier data for an overlapping 
period. This was necessary primarily for the money and income series because of 
subsequent revisions in definitions and statistical estimates. 

TABLE 4

Coefficient and Absolute t-Value of Prior Year’s Logarithm of Real Stock
Price in Multiple Regressions of Real Per Capita Money on Other Variables;
Standard Errors of Estimate of Regressions Including and Excluding Stock
Price; Annual Data, 1886-1985, and Various Subperiods

                            Logarithm of Prior Year’s      Standard Error of Estimate
                            Real Stock Price               of Regression
            Number of       ---------------------------    ---------------------------
Period      Observations    Coefficient    Absolute        Excluding      Including
                                           t-Value         Stock Price    Stock Price

1886-1985       100           -.0089          .5              .056           .056
1886-1939        54           -.105          3.3              .056           .051
1940-85          46           -.0485         2.0              .038           .037
1886-1914        29           -.0647          .8              .029           .029
1919-39          21            .0002          .003            .041           .042
1951-85          35           -.0598         4.5              .023           .018
1961-85          25           -.0047          .4              .012           .012
1951-73          23           -.1550         8.9              .023           .010
1974-85          12            .0483         1.7              .014           .012

Note: All equations contain as other independent variables the logarithm of real per capita income, the differential
yield on money, and the proxy yield on real assets. In addition, some equations include a postwar adjustment
dummy and a shift adjustment dummy.


difference between the nominal yield on money and other nominal
assets (R_N), the rate of change of money income as a proxy for the
nominal yield on physical assets (g_Y), two dummy variables to allow for
postwar adjustments (W) and a significant liquidity shift from 1929 to
1954 (S), and an adjustment for changing financial sophistication in
the United States prior to World War I. As is to be expected, an
equation calculated from annual data agrees very closely with the
equation calculated on the basis of phase-average data. 16

Table 4 summarizes the results of adding the logarithm of the real 
stock price as an independent variable to multiple correlations based 
on the annual data for various periods during the century from 1886 
to 1985. Since the dependent variable is real per capita money rather
than velocity, the wealth effect, risk-spreading effect, and transactions 
effect would all tend to produce a positive coefficient on the real stock 


16 The two equations are: annual data for 1886-1985:

log m = -1.55 + 1.16 log y - 11.9 R_N - .51 g_Y + .023 W + .138 S,
        (19.6)  (99.3)       (6.4)      (7.9)     (5.6)    (7.1)

SEE = 5.6 percent;

phase-average data for United States for 1873-1975:

log m = -1.53 + 1.15 log y - 8.82 R_N - .59 g_Y + .025 W + .17 S,
        (9.4)   (50.7)       (4.4)      (3.5)     (3.8)    (6.9)

SEE = 5.1 percent.

price, the substitution effect a negative coefficient. With annual data
we cannot separate the two effects, given that any difference in timing
is apparently measured in quarters, not years. Accordingly, we can
only estimate the net effect, and even that only crudely.

The fascinating feature of table 4 is that the coefficients are predominantly
negative, suggesting that the substitution effect has been
dominant. Moreover, the only coefficients statistically significant at the
.05 level or lower are negative: for 1886-1939, 1940-85, 1951-85,
and, most strongly, 1951-73. The positive coefficient for 1919-39 is
trivial. The only positive coefficient that approaches statistical significance
is the one for 1974-85, the final 12 years of the period covered
by our earlier analysis of quarterly data.

IV. Reconciliation of Results from Annual 
and Quarterly Data 

These results for the annual data do not contradict our earlier conclusions
for the quarterly data, but they do put them in a different light.
They suggest that the recent period has been atypical, that for most of
our history substitution effects have dominated wealth effects, and
that the opposite has prevailed only for the past several decades. An
important, and unanswered, question is whether this is a temporary
reversal or a permanent one.

The reversal is linked in time at least to the emergence of the Fisher
effect as a dominant element in the movement of interest rates. And
both are linked, not only in time but also as possible cause and effect,
to the changed monetary regime in the world. As I have emphasized
elsewhere, the world’s monetary regime since 1971 has no
historical precedent (Friedman 1986a). It is the first time that every
major currency in the world has severed all links to a commodity and
is on a strictly inconvertible paper or fiat standard, and is so, not as a
temporary expedient in time of crisis, but as a system intended to be
permanent. The transition was of course not discontinuous, even
though its final formal inception can be precisely dated as occurring
when President Nixon closed the gold window on August 15, 1971. It
was a gradual transition that doubtless had its effects long before the
final step.

Why should this transition have had the suggested effects? For
interest rates, the answer is clear: the end of a commodity link, however
tenuous, meant the end of a long-term anchor to the price level
and hence ushered in a period of increased long-term uncertainty
about future nominal values. The effect of anticipated inflation or
deflation stressed by Fisher became potentially more important, and the
experience of accelerating inflation in the 1960s and 1970s, and disinflation
in the 1980s, converted the potentiality into a clear and present
actuality. Witness the explosion of financial futures markets after
1971.

For real stock prices, the answer is less clear. The changed mone¬ 
tary regime enhanced the importance in portfolio choice and, in the 
valuing of portfolios, of real versus nominal assets. However, that 
would seem to strengthen both the substitution and the wealth effect, 
and, indeed, the 1951-73 period shows the strongest net substitution 
effect of any of the subperiods for the annual data. But why, then, 
would the annual data show a wealth effect in the final 12 years, and 
why should that effect have been dominant for the final 25 years of 
the quarterly data? I have no persuasive answers to these questions
raised by the comparison of the quarterly and the annual data. 


V. Conclusion 

The purpose of this paper has been to explore the role of the real 
stock price as a variable in the demand function for money. The 
results are suggestive but not conclusive. Quarterly data for the pe¬ 
riod since 1961 suggest that the real quantity of money (defined as 
M2) demanded relative to income is positively related to the real stock 
price, three quarters earlier, and negatively related to the contem¬ 
poraneous real stock price. The positive relation appears to reflect a 
wealth effect, the negative a substitution effect. The wealth effect 
appears stronger and is supported better by the data than the sub¬ 
stitution effect. The data contradict any effect on M2 velocity of the 
volume of total, current, or financial transactions, though, as indicated
in Appendix B, there is such an effect on M1 velocity.

Annual data for a longer period suggest that the apparent domi¬ 
nance of the wealth effect is the exception, not the rule. For all but the 
final 12 years of the century analyzed, the substitution effect appears 
to dominate any wealth effect. These results raise some puzzles that I 
have not been able to resolve. 


Appendix A 

Explanation of Peak in M2 Rate of Change in Second and Third 
Quarters of 1983 

1. The introduction of money market mutual deposit accounts (MMDA) led 
to a shift of deposits out of savings deposits (SD), time deposits (TD), and
money market mutual funds (MMMF), which would cancel in total M2. In 
addition, it led to noncanceling transfers from non-M2 funds. The following 
tables estimate the size of the shift. 

Reported Levels in Indicated Months

Month       M2        MMDA       SD        TD       MMMF
12/82     1,952.6      43.2     357.9     852.8     185.2
1/83      2,007.1     191.0     333.0     794.7     168.2
2/83      2,044.7     281.3     321.6     755.8     160.6
3/83      2,061.5     323.1     318.6     736.5     154.8


Change from Prior Month

                                                            Excess    Cumulative
Month       M2       MMDA       SD        TD       MMMF     MMDA      Excess
1/83       54.5     147.8     -24.9     -58.1     -17.0     47.8       47.8
2/83       37.6      90.3     -11.4     -38.9      -7.6     32.4       80.2
3/83       16.8      41.8      -3.0     -19.3      -5.8     13.7       93.9


Adjusted M2 (Reported M2 Minus Cumulative Excess) and Annual Rate of Change

                                                      Annual Rate of Change
                       Cumulative                     ---------------------
Month       M2         Excess        Adjusted M2      Original     Adjusted
12/82     1,952.6        ...           1,952.6           ...          ...
1/83      2,007.1        47.8          1,959.3          39.15         4.20
2/83      2,044.7        80.2          1,964.5          24.95         3.23
3/83      2,061.5        93.9          1,967.6          10.32         1.91
4/83      2,078.2        93.9          1,984.3          10.17        10.67


2. Quarterly estimates using adjusted M2: 





                                             Annual Rate of Change
                                             ---------------------
Quarter     Reported M2     Adjusted M2      Original     Adjusted
1982:4                        1,938.0
1983:1                        1,963.8          22.25         5.43
1983:2                        1,999.7          11.41         7.52
1983:3        2,130.8         2,036.9           7.30         7.65


3. To get consistent series, I multiplied all later M2’s by the ratio of adjusted 
to reported M2 for 1983:3 (.95593). 

4. This adjustment probably overstates the effect of the introduction of 
MMDAs since it implicitly assumes that there would have been no increase at 
all in SD + TD + MMMF if MMDAs had not been introduced. However, I 
suspect the error is minor, given that the adjustment started earlier and
continued later. 

5. These calculations were made in May 1986 on the basis of the then-current
data.
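
The arithmetic in steps 1 to 3 can be retraced with a short sketch. The levels are those reported in the first table of this appendix; the variable and function names are hypothetical, and monthly compounding of the annual rates is an assumption, adopted because it reproduces the reported rates to within rounding.

```python
# Reported levels from the first table of this appendix.
levels = {
    "12/82": {"M2": 1952.6, "MMDA": 43.2, "SD": 357.9, "TD": 852.8, "MMMF": 185.2},
    "1/83": {"M2": 2007.1, "MMDA": 191.0, "SD": 333.0, "TD": 794.7, "MMMF": 168.2},
    "2/83": {"M2": 2044.7, "MMDA": 281.3, "SD": 321.6, "TD": 755.8, "MMMF": 160.6},
    "3/83": {"M2": 2061.5, "MMDA": 323.1, "SD": 318.6, "TD": 736.5, "MMMF": 154.8},
}
months = ["12/82", "1/83", "2/83", "3/83"]

def annual_rate(new, old, periods_per_year):
    """Compound annual percentage rate implied by a one-period change."""
    return ((new / old) ** periods_per_year - 1.0) * 100.0

excess = {}
adjusted = {"12/82": levels["12/82"]["M2"]}
cumulative = 0.0
for prev, cur in zip(months, months[1:]):
    change = {k: levels[cur][k] - levels[prev][k] for k in levels[cur]}
    # Excess MMDA: the MMDA inflow not offset by outflows from SD, TD, and MMMF.
    excess[cur] = change["MMDA"] + change["SD"] + change["TD"] + change["MMMF"]
    cumulative += excess[cur]
    adjusted[cur] = levels[cur]["M2"] - cumulative

for prev, cur in zip(months, months[1:]):
    original = annual_rate(levels[cur]["M2"], levels[prev]["M2"], 12)
    corrected = annual_rate(adjusted[cur], adjusted[prev], 12)
    print(cur, round(excess[cur], 1), round(adjusted[cur], 1),
          round(original, 2), round(corrected, 2))

# Step 2: the adjusted first-quarter figure as the average of the three months.
q1_adjusted = sum(adjusted[m] for m in months[1:]) / 3
# Step 3: ratio of adjusted to reported M2 for 1983:3, applied to all later M2's.
ratio_1983_3 = 2036.9 / 2130.8
print(round(q1_adjusted, 1), round(ratio_1983_3, 5))
```

Treating the quarterly figure as a simple average of monthly values is likewise an assumption; it matches the adjusted M2 shown for the first quarter of 1983.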

Appendix B

M1 Velocity and the Volume of Transactions

In the course of exploring the effect of the volume of transactions on the
velocity of M2, I also made some calculations for M1 velocity. Since M1 comes
closer than M2 to approximating a medium-of-exchange concept of money,
there is reason to expect that the demand for M1 would be affected more by
the volume of transactions than the demand for M2, and that has turned out
to be the case.

I have not made a detailed analysis for M1 velocity, but I report here some 
calculations based on quarterly data for 1970:1-1986:2 that are suggestive. 17 

The first conclusion, not recorded in the tables that follow, is that neither
the differential yield on M1 (computed like R_N except using M1 instead of M2
in the ratio of high-powered money to the stock of money) nor g_Y seems to be
a significant variable in the demand for M1; these returns on nominal and
real assets are understandably far more significant for the broader concept of
money than for M1. On the other hand, the 3-month T-bill rate is a significant
variable and so, clearly, is a time trend.

At first impression, equation A in table B1 suggests that total transactions
relative to GNP are an important variable. However, examination of charts of
the basic data suggests that the first impression is misleading. As is well
known, the upward trend in M1 velocity came to a sharp halt in 1980 and was
replaced by a declining trend. It so happens that total transactions accelerated
sharply after 1980 thanks largely to an explosion in stock market activity and
capital transactions. The negative coefficient attached to this acceleration in
transactions offsets the positive coefficient of time. So the transactions variable
is really contributing only one observation. Equation B tests this conjecture
by replacing time by a combination of time to 1979:4 and a constant
thereafter. The result is a statistically insignificant coefficient of the transactions
variable and a lower standard error of estimate. Equation C shows that
replacing total transactions by current transactions improves the correlation.
The coefficient of the current transactions variable is highly significant statistically
and the standard error is lower. 18

Equation D, which includes capital and current transactions, gives an even
lower standard error of estimate, with the coefficients of both being negative
and statistically significant, though the coefficient of the current transactions
variable both is larger and has a higher t-value.

As far as these regressions go, they give a somewhat mixed message. They 
support what I take to be the received view on current transactions, that their 
ratio to GNP has a sizable positive effect on the demand for the medium of 
exchange. However, with respect to capital transactions, regression D suggests
that they too have a significant, if much smaller, effect, whereas I take
the received view to be that financial transactions are so highly money- 


17 For the text of this paper, I have updated through 1986:4 the data used in my
initial explorations. For this appendix, I have not done so.

18 In a correlation such as eq. A with a straight time variable, the logarithm of the 
ratio of current transactions to GNP has a positive rather than negative coefficient (of 
1.09) with a t-value of 3.6. However, the standard error of estimate is more than double
that of eq. A. 

efficient that they do not absorb any appreciable quantity of media of exchange
and have little if any effect on velocity. 19

As a further check on the effect of the shift from accelerating inflation to
disinflation, table B2 repeats regression A of table B1, adds similar regressions
for two subperiods, 1970:1-1979:4 and 1980:1-1986:2, and adds additional
regressions for the first period.

Regressions E and F confirm the conclusion from table B1 about the role
played by the transactions variable for the period as a whole. For 1980:1-1986:2,
neither total transactions nor time is statistically significant. However,
the regressions for the earlier period do not confirm what I called the received
view about the relative importance of capital and current transactions.
According to these regressions, while the current transactions ratio alone
does yield a statistically significant coefficient (t-value of 5.3), the t-value for
the capital transactions ratio alone is even higher, and if the two ratios are
included separately, the capital transactions ratio is dominant. There is essentially
nothing to choose among equations F, H, and I. According to equation
F, the elasticity of M1 velocity with respect to the ratio of total transactions to
GNP is -0.36, and, according to equation I, with respect to the ratio of either
current or capital transactions alone, -0.24.

Disentangling the somewhat conflicting conclusions suggested by the re¬ 
gressions for the period as a whole and for the first subperiod will require 
additional evidence. The Federal Reserve may provide some such evidence if 
Paul Spindt is able to extend his estimates of the volume of transactions to the 
period prior to 1970. 

Permanent and Temporary Components of
Stock Prices


Eugene F. Fama and Kenneth R. French

University of Chicago


A slowly mean-reverting component of stock prices tends to induce 
negative autocorrelation in returns. The autocorrelation is weak for 
the daily and weekly holding periods common in market efficiency 
tests but stronger for long-horizon returns. In tests for the 1926-85 
period, large negative autocorrelations for return horizons beyond a
year suggest that predictable price variation due to mean reversion
accounts for large fractions of 3-5-year return variances. Predictable
variation is estimated to be about 40 percent of 3-5-year return
variances for portfolios of small firms. The percentage falls to 
around 25 percent for portfolios of large firms. 


I. Introduction 

Early tests of market efficiency examined autocorrelations of daily 
and weekly stock returns. Sample sizes for such short return horizons 
are typically large, and reliable evidence of nonzero autocorrelation is 
common. Since the estimated autocorrelations are usually close to 0.0, 
however, most studies conclude that the implied predictability of re¬ 
turns is not economically significant. Fama (1970) summarizes this 
early work, which largely concludes that the stock market is efficient. 

Summers (1986) challenges this interpretation of the autocorrela¬ 
tion of short-horizon returns. He argues that the claim in common 
models of an inefficient market is that prices take long temporary
swings away from fundamental values, which he translates into the 
statistical hypothesis that prices have slowly decaying stationary com¬ 
ponents. He shows that autocorrelations of short-horizon returns can 
give the impression that such mean-reverting components of prices 
are of no consequence when in fact they account for a substantial 
fraction of the variation of returns. 

Our tests are based on the converse proposition that the behavior of 
long-horizon returns can give a clearer impression of the importance 
of mean-reverting price components. Specifically, a slowly decaying 
component of prices induces negative autocorrelation in returns that 
is weak for the daily and weekly holding periods common in market 
efficiency tests. But such a temporary component of prices can induce 
strong negative autocorrelation in long-horizon returns. 
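
The contrast between short and long horizons can be illustrated with a small calculation. The sketch assumes, for illustration only, that the log price is the sum of a random walk and a stationary first-order autoregressive component with coefficient phi; under that assumption the first-order autocorrelation of non-overlapping T-period returns is -(1 - phi^T)^2 var_z / [2(1 - phi^T) var_z + T var_eta], where var_z is the variance of the stationary component and var_eta the variance of the random-walk innovation. The function name and the parameter values are hypothetical.

```python
def long_horizon_autocorr(horizon, phi, var_z, var_eta):
    """First-order autocorrelation of non-overlapping `horizon`-period returns
    when log price = random walk + AR(1) stationary component.

    phi     : AR(1) coefficient of the stationary component, 0 <= phi < 1
    var_z   : variance of the stationary component
    var_eta : variance of the one-period random-walk innovation
    """
    decay = 1.0 - phi ** horizon
    covariance = -(decay ** 2) * var_z               # only the stationary part contributes
    variance = 2.0 * decay * var_z + horizon * var_eta
    return covariance / variance

# Hypothetical parameters: a slowly decaying component observed at several horizons.
for horizon in (1, 12, 48, 200):
    print(horizon, round(long_horizon_autocorr(horizon, 0.98, 1.0, 0.01), 3))
```

Without a random-walk component the value tends to -0.5 as the horizon grows. With one, it is close to zero at short horizons, larger in magnitude at intermediate horizons, and drifts back toward zero at very long horizons as the random-walk variance comes to dominate.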

We examine autocorrelations of stock returns for increasing hold¬ 
ing periods. In the results for the 1926-85 sample period, large nega¬ 
tive autocorrelations for return horizons beyond a year are consistent 
with the hypothesis that mean-reverting price components are impor¬ 
tant in the variation of returns. The estimates for industry portfolios 
suggest that predictable variation due to mean reversion is about 35 
percent of 3-5-year return variances. Returns are more predictable 
for portfolios of small firms. Predictable variation is estimated to be 
about 40 percent of 3-5-year return variances for small-firm port¬ 
folios. The percentage falls to around 25 percent for portfolios of 
large firms. 

Our results add to mounting evidence that stock returns are pre¬ 
dictable (see, e.g., Bodie 1976; Jaffe and Mandelker 1976; Nelson 
1976; Fama and Schwert 1977; Fama 1981; Campbell 1987; French, 
Schwert, and Stambaugh 1987). Again, this work focuses on short 
return horizons (De Bondt and Thaler [1985] are an exception), and
the common conclusion is that predictable variation is a small part 
(usually less than 3 percent) of the variation of returns. There is little 
in the literature that foreshadows our estimates that 25-45 percent of 
the variation of 3-5-year stock returns is predictable from past re¬ 
turns. 

There are two competing economic stories for strong predictability 
of long-horizon returns due to slowly decaying price components. 
Such price behavior is consistent with common models of an irrational 
market in which stock prices take long temporary swings away from 
fundamental values. But the predictability of long-horizon returns 
can also result from time-varying equilibrium expected returns gen¬ 
erated by rational pricing in an efficient market. Poterba and Sum¬ 
mers (1987) show formally how these opposite views can imply the 
same price behavior. The intuition is straightforward. 

Expected returns correspond roughly to the discount rates that
relate a current stock price to expected future dividends. Suppose
that investor tastes for current versus risky future consumption and
the stochastic evolution of the investment opportunities of firms re¬ 
sult in time-varying equilibrium expected returns that are highly 
autocorrelated but mean-reverting. Suppose that shocks to expected 
returns are uncorrelated with shocks to rational forecasts of divi¬ 
dends. Then a shock to expected returns has no effect on expected 
dividends or expected returns in the distant future. Thus the shock 
has no long-term effect on expected prices. The cumulative effect of a 
shock on expected returns must be exactly offset by an opposite ad¬ 
justment in the current price. 

In this scenario, autocorrelated equilibrium expected returns lead 
to slowly decaying components of prices that are indistinguishable 
from the temporary price components of an inefficient market, at 
least with univariate tests like those considered here. More informed 
choices between the competing explanations of return predictability 
will require models that restrict the variation of expected returns in 
plausible ways, for example, models that restrict the relations between 
the behavior of macroeconomic driving variables and equilibrium
expected returns.

Finally, tests on long-horizon returns can provide a better impression
of the importance of slowly decaying stationary price components, but
the cost is statistical imprecision. The temporary component of prices
must account for a large fraction of return variation to
be identified in the univariate properties of long-horizon returns. We 
find “reliable” evidence of negative autocorrelation only in tests on 
the entire 1926-85 sample period, and the evidence is clouded by the 
statistical issues (changing parameters, heteroscedasticity, etc.) that 
such a long time period raises. 


II. A Simple Model for Stock Prices 

Let p(t) be the natural log of a stock price at time t. We model p(t) as
the sum of a random walk, q(t), and a stationary component, z(t),

$$p(t) = q(t) + z(t), \tag{1}$$

$$q(t) = q(t - 1) + \mu + \eta(t), \tag{2}$$

where μ is expected drift and η(t) is white noise. Summers (1986)
argues that the long temporary price swings assumed in models of an
inefficient market imply a slowly decaying stationary price
component. As an example, he suggests a first-order autoregression (AR1),

$$z(t) = \phi z(t - 1) + \epsilon(t), \tag{3}$$

where ε(t) is white noise and φ is close to but less than 1.0.

The model (1)-(3) is just one way to represent a mix of random-
walk and stationary price components. The general hypothesis is that 
stock prices are nonstationary processes in which the permanent gain 
from each month’s price shock is less than 1.0. Our tests are relevant 
for the general class of models in which part of each month’s shock is 
permanent and the rest is gradually eliminated. The tests center on 
the fact that the temporary part of the shock implies predictability 
(negative autocorrelation) of returns. 
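
The model (1)-(3) can be sketched as a short simulation. This is a minimal illustration rather than the authors' code: the function name `simulate_log_prices` and the default parameter values (φ = 0.98, zero drift, unit shock standard deviations) are assumptions chosen only to make the example concrete.

```python
import random


def simulate_log_prices(n_months, mu=0.0, sigma_eta=1.0,
                        phi=0.98, sigma_eps=1.0, seed=0):
    """Simulate log prices p(t) = q(t) + z(t) from equations (1)-(3).

    All default values are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    q = 0.0  # random-walk component
    z = 0.0  # stationary AR1 component
    prices = []
    for _ in range(n_months):
        q = q + mu + rng.gauss(0.0, sigma_eta)    # equation (2)
        z = phi * z + rng.gauss(0.0, sigma_eps)   # equation (3)
        prices.append(q + z)                      # equation (1)
    return prices


# 720 months corresponds to a 60-year sample such as 1926-85.
log_prices = simulate_log_prices(720)
monthly_returns = [b - a for a, b in zip(log_prices, log_prices[1:])]
```

Setting `sigma_eps` to zero gives a pure random walk, and setting `sigma_eta` and `mu` to zero gives a purely stationary price, which are the two limiting cases discussed in the text.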


A. The Implications of a Stationary Price Component 

Since p(t) is the natural log of the stock price, the continuously
compounded return from t to t + T is

$$r(t, t + T) = p(t + T) - p(t) = [q(t + T) - q(t)] + [z(t + T) - z(t)]. \tag{4}$$

The random-walk price component produces white noise in returns. We
show next that the mean reversion of the stationary price component
z(t) causes negative autocorrelation in returns.

The slope in the regression of z(t + T) - z(t) on z(t) - z(t - T), the
first-order autocorrelation of T-period changes in z(t), is

$$\rho(T) = \frac{\mathrm{cov}[z(t + T) - z(t),\, z(t) - z(t - T)]}{\sigma^2[z(t + T) - z(t)]}. \tag{5}$$

The numerator covariance is

$$\mathrm{cov}[z(t + T) - z(t),\, z(t) - z(t - T)] = -\sigma^2(z) + 2\,\mathrm{cov}[z(t), z(t + T)] - \mathrm{cov}[z(t), z(t + 2T)]. \tag{6}$$

The stationarity of z(t) implies that the covariances on the right of (6)
approach 0.0 as T increases, so the covariance on the left approaches
-σ²(z). The variance in the denominator of the slope,

$$\sigma^2[z(t + T) - z(t)] = 2\sigma^2(z) - 2\,\mathrm{cov}[z(t + T), z(t)], \tag{7}$$

approaches 2σ²(z). We can infer from (6) and (7) that the slope in the
regression of z(t + T) - z(t) on z(t) - z(t - T) approaches -0.5 for
large T.

The slope ρ(T) has an interesting interpretation used often in the
empirical work of later sections. If z(t) is an AR1, the expected change
from t to t + T is

$$E_t[z(t + T) - z(t)] = (\phi^T - 1)\,z(t), \tag{8}$$

and the covariance in the numerator of ρ(T) is

$$\mathrm{cov}[z(t + T) - z(t),\, z(t) - z(t - T)] = (-1 + 2\phi^T - \phi^{2T})\,\sigma^2(z) = -(1 - \phi^T)^2\,\sigma^2(z). \tag{9}$$

With (8) and (9) we can infer that the covariance is minus the variance
of the T-period expected change, -σ²[E_t z(t + T) - z(t)]. Thus, when
z(t) is an AR1, the slope in the regression of z(t + T) - z(t) on z(t) -
z(t - T) is (minus) the ratio of the variance of the expected change in
z(t) to the variance of the actual change. This interpretation of the
slope is a valid approximation for any slowly decaying stationary
process. 1

Equation (8) shows that when φ is close to 1.0, the expected change
in an AR1 slowly approaches -z(t) as T increases. Likewise, the slope
ρ(T) is close to 0.0 for short return horizons and slowly approaches
-0.5. This illustrates Summers’s (1986) point that slow mean
reversion can be missed with the short return horizons common in market
efficiency tests. Our tests are based on the converse insight that slow
mean reversion can be more evident in long-horizon returns.
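
The slow approach of the slope to -0.5 can be checked numerically. For an AR1, cov[z(t + T), z(t)] = φ^T σ²(z), so (7) becomes 2σ²(z)(1 - φ^T), and dividing (9) by (7) gives ρ(T) = -(1 - φ^T)/2. The sketch below evaluates this ratio; the monthly value φ = 0.98 and the function name `rho_ar1` are assumptions made only for illustration.

```python
def rho_ar1(phi, horizon):
    """Slope rho(T) in (5) when z(t) is an AR1 with parameter phi.

    Numerator and denominator are expressed in units of sigma^2(z).
    """
    numerator = -(1.0 - phi ** horizon) ** 2    # equation (9)
    denominator = 2.0 * (1.0 - phi ** horizon)  # equation (7) for an AR1
    return numerator / denominator


# Horizons in months; phi = 0.98 is a hypothetical monthly parameter.
for months in (1, 12, 36, 60, 120):
    print(months, round(rho_ar1(0.98, months), 3))
```

With φ close to 1.0 the 1-month slope is near zero, and the slope moves toward -0.5 only at horizons of several years, which is the pattern the text describes.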


B. The Properties of Returns 

Since we do not observe z(t), we infer its existence and properties
from the behavior of returns. Let β(T) be the slope in the regression
of the return r(t, t + T) on r(t - T, t). If changes in the random-walk
and stationary components of stock prices are uncorrelated,

$$\beta(T) = \frac{\mathrm{cov}[r(t, t + T),\, r(t - T, t)]}{\sigma^2[r(t - T, t)]} \tag{10}$$

$$= \frac{\rho(T)\,\sigma^2[z(t + T) - z(t)]}{\sigma^2[z(t + T) - z(t)] + \sigma^2[q(t + T) - q(t)]} \tag{10a}$$

$$= \frac{-\sigma^2[E_t z(t + T) - z(t)]}{\sigma^2[r(t, t + T)]}. \tag{10b}$$


1 For long return horizons, the interpretation of the slope as the proportion of the
variance of the change in z(t) due to the expected change is valid for any stationary
process. If z(t) is a stationary process with a zero mean, the expected change from t to
t + T approaches -z(t) as T increases, and the variance of the expected change approaches
σ²(z). The ratio of the long-horizon variance of the expected change in z(t), σ²(z), to the
long-horizon variance of the actual change, 2σ²(z), is thus 0.5, the negative of the long-
horizon value of ρ(T).

Expression (10b) highlights the result that β(T) measures the
proportion of the variance of T-period returns explained by (or predictable
from) the mean reversion of a slowly decaying price component z(t).
Expression (10a) helps predict the behavior of the slopes for
increasing values of T. If the price does not have a stationary component, the
slopes are 0.0 for all T. If the price does not have a random-walk
component, β(T) = ρ(T) and the slopes approach -0.5 for large
values of T.

Predictions about the slope β(T) are more complicated if the stock
price has both random-walk and stationary components. The mean
reversion of the stationary component tends to push the slopes
toward -0.5 for long return horizons, while the variance of the white-
noise component, q(t + T) - q(t), pushes the slopes toward 0.0. Since
the variance of z(t + T) - z(t) approaches 2σ²(z) as the return horizon
increases and the white-noise variance grows like T, the white-noise
component eventually dominates. Thus, if stock prices have both
random-walk and slowly decaying stationary components, the slopes
in regressions of r(t, t + T) on r(t - T, t) might form a U-shaped
pattern, starting around 0.0 for short horizons, becoming more
negative as T increases, and then moving back toward 0.0 as the white-
noise variance begins to dominate at long horizons.
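
The U-shaped pattern can be illustrated by evaluating (10a) for the AR1 case. The sketch assumes σ²(z) = σ²(ε)/(1 - φ²), σ²[z(t + T) - z(t)] = 2σ²(z)(1 - φ^T), and σ²[q(t + T) - q(t)] = Tσ²(η), all of which follow from (2) and (3). The parameter values and the name `beta_mixed` are hypothetical.

```python
def beta_mixed(horizon, phi, var_eps, var_eta):
    """Slope beta(T) from (10a) for the mixed AR1 plus random-walk model."""
    var_z = var_eps / (1.0 - phi ** 2)               # variance of the AR1 level
    var_dz = 2.0 * var_z * (1.0 - phi ** horizon)    # equation (7) for an AR1
    var_dq = horizon * var_eta                       # white-noise variance grows like T
    rho = -(1.0 - phi ** horizon) / 2.0              # ratio of (9) to (7)
    return rho * var_dz / (var_dz + var_dq)


# Hypothetical monthly parameters: phi = 0.98 and equal shock variances.
betas = [beta_mixed(T, 0.98, 1.0, 1.0) for T in range(1, 241)]
most_negative_horizon = betas.index(min(betas)) + 1
print(most_negative_horizon, round(min(betas), 3))
```

Under these assumed values the slope is near 0.0 at a 1-month horizon, reaches its minimum at an intermediate horizon, and then drifts back toward 0.0, so the minimum lies strictly inside the range of horizons.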

Finally, existing evidence (e.g., Fama and Schwert 1977; Keim and
Stambaugh 1986; Fama and French 1987; French et al. 1987)
suggests that expected returns are positively autocorrelated. The
negative autocorrelation of long-horizon returns due to a stationary
component of prices is consistent with positively autocorrelated expected
returns. For example, the model (1)-(3) implies negatively
autocorrelated returns. Poterba and Summers (1987) show, however, that if
the stationary price component z(t) in (3) is an AR1 with parameter
φ > 0.0, the expected return is an AR1 with parameter φ and so is
positively autocorrelated. The economic intuition is that shocks to
expected returns (discount rates) can generate opposite shocks to
current prices, and returns can be negatively autocorrelated when
expected returns are positively autocorrelated.

III. The Autocorrelation of Industry and Decile 
Portfolio Returns 

A. The Data 

The mix of random-walk and stationary components in stock prices 
can differ across stocks. Firm-size and industry are dimensions known
to capture differences in return behavior (see King 1966; Banz 1981;
Huberman and Kandel 1985). We examine results for industry
portfolios and for portfolios formed on the basis of size.

The basic data are 1-month returns for all New York Stock Exchange
(NYSE) stocks for the 1926-85 period from the Center for
Research in Security Prices. At the end of each year, stocks are ranked 
on the basis of size (shares outstanding times price per share) and 
grouped into ten (decile) portfolios. One-month portfolio returns, 
with equal weighting of securities, are calculated and transformed 
into continuously compounded returns. These nominal returns are 
adjusted for the inflation rate of the U.S. Consumer Price Index (CPI) 
and then summed to get overlapping monthly observations on 
longer-horizon returns. Unless otherwise noted, return henceforth 
implies a continuously compounded real return. 
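
The construction of long-horizon returns can be sketched as follows. This is only an illustration of the steps described above: the function names are hypothetical, and the inflation adjustment is assumed to be the difference between the continuously compounded nominal return and the continuously compounded CPI inflation rate.

```python
import math


def real_log_returns(nominal_simple, inflation_simple):
    """Continuously compounded real returns from simple monthly rates.

    Assumes the adjustment is log(1 + R) - log(1 + inflation).
    """
    return [math.log(1.0 + r) - math.log(1.0 + pi)
            for r, pi in zip(nominal_simple, inflation_simple)]


def overlapping_sums(monthly_log_returns, horizon_months):
    """Sum monthly log returns into overlapping horizon returns.

    Element i is the return from month i to month i + horizon_months.
    """
    n = len(monthly_log_returns)
    return [sum(monthly_log_returns[i:i + horizon_months])
            for i in range(n - horizon_months + 1)]


# A 720-month sample yields 661 overlapping observations on 5-year returns.
five_year = overlapping_sums([0.0] * 720, 60)
print(len(five_year))
```

Because log returns add across months, summing them gives the continuously compounded return over the longer horizon without any further compounding step.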

There is a problem with the decile portfolios. Stocks with unusually 
high or low returns tend to move across deciles from one year to the 
next. If unusual returns are caused by temporary price swings,
subsequent reversals may be missed—the tests may understate the
importance of stationary price components—because of the movement of
stocks across deciles. Since the problem is less severe for portfolios 
that include all stocks, we also show results for the equal- and value- 
weighted portfolios of all NYSE stocks. The value-weighted market 
portfolio summarizes the return behavior of large stocks, while the 
equal-weighted portfolio is tilted more toward small stocks. 

Using Standard Industrial Classification codes, we also form 17 
industry portfolios, with equal weighting of the stocks in a portfolio. 
One criterion in defining an industry is that it contains firms in similar 
activities. The other criterion is that the industry produces diversified
portfolios during the 1926-85 period. Each of the 17 industries always
has at least seven firms (15 after 1929), and the number of firms
per industry is usually greater than 30. Within industries, there is 
little concentration of firms by size. For example, the average of the 
decile ranks of the firms in an industry is typically between 4.0 and 
7.0. Thus size and industry are not proxies, and size and industry 
portfolios can provide independent evidence on the behavior of long- 
horizon returns. (Details on the industry portfolios are available from 
the authors.) 

The tests center on slopes in regressions of r(t, t + T) on r(t - T, t). 
The slopes are first-order autocorrelations of T-year returns. Ordinary
least squares (OLS) estimates have a bias that depends on the
true slopes, sample sizes, and the overlap of monthly data on long-
horizon returns (see Kendall 1954; Marriott and Pope 1954; Huizinga
1984). Proper bias adjustments when the true slopes are 0.0 (prices do
not have stationary components) are difficult to determine
analytically. We use simulations, constructed to mimic properties of stock
returns, to estimate the bias adjustments (see the Appendix). The
simulations also show that when prices have stationary components
that generate negative autocorrelations on the order of those
observed here, simple OLS slopes have little bias. We examine both OLS
and bias-adjusted slopes. 


B. Regression Slopes for the 1926-85 Sample Period

Industries

Table 1 shows slopes in regressions of r(t, t + T) on r(t - T, t) for
return horizons from 1 to 10 years, using the industry portfolio data 
for the 1926-85 sample period. As predicted by the hypothesis that 
prices have stationary components, negative slopes are the rule. The 
bias-adjusted slopes are uniformly negative for return horizons from 
2 to 5 years. The unadjusted slopes are almost always negative for all 
horizons. The slopes reach minimum values for 3-5-year returns, 
and they become less negative for return horizons beyond 5 years. 
This U-shaped pattern is consistent with the hypothesis that stock 
prices also have random-walk components that eventually dominate 
long-horizon returns. Estimated slopes (not shown) for nominal
returns are usually within 0.04 of those for real returns.

The slopes for 3-, 4-, and 5-year returns are large in magnitude and 
relative to their standard errors. The average values of the bias- 
adjusted slopes for 3-, 4-, and 5-year returns are -0.30, -0.34, and 
-0.32; the averages of the unadjusted slopes are -0.38, -0.45, and 
-0.45. Expression (10b) says that the slope measures the proportion
of the variance of T-year returns due to time-varying expected
returns generated by slowly decaying stationary price components. The
slopes for the industry portfolios thus suggest that these time-varying 
expected returns average between 30 percent and 45 percent of the 
variances of 3-5-year returns. 

Moreover, the limiting argument for the slopes in Section II says 
that the variance of the expected change in the stationary price
component z(t) approaches half the variance of the long-horizon change
in z(t). Thus regression slopes that average between -0.30 and -0.45
estimate that, on average, between 60 percent and 90 percent of the 
variances of 3-5-year industry returns are due to the stationary price 
component z(t). 
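
The arithmetic behind the 60-90 percent range can be written out explicitly. The sketch assumes the long-horizon limit from Section II holds exactly, so that the share of return variance attributed to z(t) is twice the magnitude of the slope. The helper name `stationary_share` is hypothetical.

```python
def stationary_share(slope):
    """Share of return variance attributed to the stationary component.

    Assumes the long-horizon limit in which the variance of the expected
    change in z(t) is half the variance of the actual change in z(t).
    """
    return 2.0 * abs(slope)


for slope in (-0.30, -0.45):
    print(slope, round(stationary_share(slope), 2))
```

At horizons of 3-5 years the limit may hold only approximately, so the doubling is best read as a rough guide rather than an exact decomposition.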

A caveat is in order. The hypothesis that prices contain both 
random-walk and slowly decaying stationary components predicts a 
U-shaped pattern of slopes for increasing return horizons. This
provides some justification for leaning toward extreme slopes to estimate
proportions of return variances due to the two components of prices.
Since we do not predict the return horizons likely to produce extreme
slopes, however, using the observed extremes to estimate proportions
of variance probably overstates the importance of stationary
components of prices.

Moreover, a pervasive characteristic of the tests is that small
effective sample sizes imply imprecise slope estimates for long-horizon
returns. The large standard errors of the industry slopes (averaging 
0.11 for 1-year returns and 0.26 for 10-year returns) leave much 
uncertainty about the true slopes and thus about the proportions of 
variance due to the random-walk and stationary components of 
prices. (See the Appendix for pertinent details.) 


Deciles 

There is no obvious pattern in the variation of the regression slopes 
across industries. There is a clearer pattern in the slopes for the decile 
portfolios in table 2. Like the industry slopes, the decile slopes are 
negative and large for 2-5-year returns. However, the minimum 
values of the slopes tend to be more extreme for lower (smaller-firm) 
deciles. All the bias-adjusted slopes less than -0.30 and all the
unadjusted slopes less than -0.37 are generated by the equal-weighted
market portfolio and deciles 1-7. Most of the 4- and 5-year bias-
adjusted slopes for these portfolios are more than 2.0 standard errors
below 0.0. The value-weighted market and the larger-firm deciles 9
and 10 produce no bias-adjusted slopes more than 2.0 standard
errors below 0.0.

Again, perspective is in order. The large standard errors of the 
decile slopes—between 0.13 and 0.20 for 3-5-year returns—mean 
that if stock prices have stationary components, they must generate 
large negative slopes (and account for large fractions of variance) to 
be identified reliably, even when the estimates use the entire 1926-85 
sample period. Nevertheless, every decile produces a simple OLS 
slope for 3-, 4-, or 5-year returns more than 2.0 standard errors below 
0.0. And the U-shaped pattern of the slopes across return horizons 
predicted by the hypothesis that prices have both random-walk and 
slowly decaying stationary components is observed for all the deciles, 
the industry portfolios, and the two market portfolios. 

We conclude that the tests for 1926-85 are consistent with the 
hypothesis that stock prices have both random-walk and stationary 
components. The estimates suggest that stationary price components 
account for large fractions of the variation of returns and that they 
are relatively more important for small-stock portfolios. We
recognize, however, that the imprecision of the tests implies substantial
uncertainty about any interpretation of the results. The relevance of
this caveat is obvious in the subperiod results that follow. 

C. Subperiod Autocorrelations 

Because the regression slopes are not estimated precisely, the results 
for the 1926-85 period are in principle the strongest test of the 
hypothesis that stock prices have stationary components. There are, 
however, reasons to examine subperiods. First, return variances drop 
substantially after 1940 (see Officer 1973; French et al. 1987). The 
variance changes make inference less precise even if the
autocorrelations of returns are stationary. Moreover, the high variances of the
early years are associated with large price swings. It is possible that the
large negative autocorrelations estimated for 1926-85 are a
consequence of the early years.

We have estimated the slopes in the regression of r(t, t + T) on
r(t - T, t) for the 30-year splits, 1926-55 and 1956-85, and for the
longer 1946-85 and 1941-85 periods. The estimates for 1941-85 are 
in tables 3 and 4. We choose 1941-85 because it is the longest period 
of roughly constant return variances. The regression slopes it pro¬ 
duces are similar in magnitude and pattern to those for 1946-85 and 
1956-85. 

Like 1926-85, the 1941-85 period produces a general pattern of 
negative autocorrelation of returns that is consistent with the
hypothesis that prices have stationary components. However, the 1941-85
bias-adjusted slopes are typically closer to 0.0, and they do not
produce the strong U-shaped pattern across return horizons observed 
for 1926-85. Moreover, large standard errors (averaging 0.13 for 
1-year industry portfolio returns and 0.27 for 8-year returns) make 
the hypothesis that prices contain no stationary components (the true 
slopes are 0.0) difficult to reject. 

Large standard errors make most hypotheses about subperiods
difficult to reject. For example, slope estimates for 1926-55 (not shown)
have an even stronger U-shaped pattern than those for 1926-85, 
while estimates for 1956-85 (also not shown) are much like those for 
1941-85. However, the hypothesis that the slopes for 1926-55 and 
1956-85 are equal cannot be rejected; indeed, large standard errors 
make the hypothesis essentially untestable. 

In short, the preponderance of negative slopes observed for all 
periods (shown and not shown) is consistent with the hypothesis that 
stock prices have stationary components that generate negative
autocorrelation in long-horizon returns. Subperiod slopes suggest that the
negative autocorrelation is weaker (stationary price components are
less important in the variation of returns) after 1940. But reliable
contrasts across periods are impossible. Perhaps stationary price
components are less important after 1940. Perhaps prices no longer have
such temporary components. Only time (and lots of it) will tell. 2 


IV. Negative Autocorrelation: Common or 
Firm-specific Factors? 

An important economic question is whether the negative
autocorrelation of long-horizon returns is due to common or firm-specific
factors. Evidence that the autocorrelation is due to common factors
would raise the possibility that it can be traced to common
macroeconomic driving variables. On the other hand, evidence that the
autocorrelation is firm-specific would raise the possibility that expected
returns have firm-specific components. Such a finding would
challenge the relevance of parsimonious equilibrium pricing models. We
summarize briefly some preliminary work on these issues. 

A. Portfolios 

Evidence that a single portfolio absorbs the negative autocorrelation 
of returns for all the industry and decile portfolios would suggest that 
the negative autocorrelation of the 1926-85 period is due to one 
common factor. We have estimated residual autocorrelations for
regressions of the decile and industry portfolio returns on the return to
decile 1. We choose decile 1 as the explanatory portfolio because of
the evidence in table 2 that the general negative autocorrelation of
portfolio returns is a larger fraction of the variation of returns on
portfolios of small firms.

For the 1926-85 period, the residual autocorrelations for the
industry and decile portfolios are more often positive than negative, but
they are typically close to 0.0. Results for other periods are similar.
The evidence suggests that the general negative autocorrelation of
portfolio returns is largely due to a common macroeconomic
phenomenon.
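
One way to compute a residual autocorrelation of this kind is sketched below. The text does not spell out the exact procedure, so the two-step approach here (regress a portfolio's returns on decile 1 returns, then regress the residuals on their own lagged values) is an assumption, as are all function names and the made-up return series.

```python
def ols_fit(y, x):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(y, x):
    """Residuals from the regression of y on x."""
    intercept, slope = ols_fit(y, x)
    return [b - intercept - slope * a for a, b in zip(x, y)]


def lagged_slope(series, lag):
    """Slope in the regression of series(t) on series(t - lag)."""
    _, slope = ols_fit(series[lag:], series[:-lag])
    return slope


def residual_autocorrelation(portfolio, decile1, lag):
    """Autocorrelation of the part of portfolio returns not explained by decile 1."""
    return lagged_slope(residuals(portfolio, decile1), lag)


# Hypothetical return series, for illustration only.
portfolio = [0.10, -0.05, 0.20, 0.02, -0.12, 0.08, 0.15, -0.03]
decile1 = [0.15, -0.10, 0.30, 0.01, -0.20, 0.12, 0.18, -0.06]
print(residual_autocorrelation(portfolio, decile1, lag=1))
```

For T-year returns built from overlapping monthly observations, the lag would be the horizon in months, so that the two residual windows do not overlap.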


2 We have also tested the autocorrelation of returns by examining return variances 
for increasing horizons (see Alexander 1961; Cochrane 1986; French and Roll 1986;
Lo and MacKinlay 1986). Return variances for the industry and decile portfolios
behave as predicted by the hypothesis that stock prices have stationary components that
induce negative autocorrelation in returns. In particular, the variances grow less than
in proportion to the return horizon. Unlike the regression slopes in tables 1 and 2,
however, the variance tests for the 1926-85 period do not reliably identify negative
autocorrelation in long-horizon returns. We mention the variance tests to emphasize
that different univariate approaches to identifying slowly decaying stationary
components of price have the common problem of low statistical power—a point treated in
detail in Poterba and Summers (1987).

B. Individual Securities

Since the decile and industry portfolios are diversified, firm-specific 
factors contribute little to the variation of their returns. Tests for 
autocorrelation due to firm-specific factors must use individual stocks. 
A problem, however, is that reliable inferences about long-horizon 
returns require long sample periods, but the population of NYSE 
stocks changes through time. Our preliminary solution is to study the 
82 stocks listed for the entire 1926-85 period. 

The equal-weighted portfolio of the 82 stocks produces a strong 
U-shaped pattern of autocorrelations like that observed for the equal- 
weighted portfolio of all stocks in table 2. The bias-adjusted
autocorrelations for 2-, 3-, and 4-year returns on this portfolio are -0.26,
-0.36, and -0.28, and they are at least 1.99 standard errors below
0.0. The autocorrelation of returns on the 82 individual stocks is
weaker. The averages of the (82) 2-, 3-, and 4-year bias-adjusted
slopes are -0.10, -0.16, and -0.10, and the slopes are, on average,
-0.78, -1.09, and -0.76 standard errors from 0.0. Moreover, even
the weak negative autocorrelation in the individual stock returns
disappears in the residuals from regressions of the returns on decile 1.
The average bias-adjusted residual autocorrelations for the 82 stocks
are close to 0.0 for all return horizons, and the cross-sectional
distributions of the autocorrelations are roughly symmetric about 0.0.
Tests on the 230 stocks listed for the 1941-85 period yield similar 
results. 

Heavy-handed conclusions from these rather special samples of
survivors are inappropriate. But the fact that returns on portfolios of 
the survivors have autocorrelations similar to those of the equal- 
weighted market portfolio gives some confidence in the individual 
stock evidence that firm-specific components of long-horizon stock 
returns have no autocorrelation. The results are heartening for
proponents of parsimonious equilibrium pricing models.


V. Conclusions 

First-order autocorrelations of industry and decile portfolio returns 
for the 1926-85 period form a U-shaped pattern across increasing 
return horizons. The autocorrelations become negative for 2-year
returns, reach minimum values for 3-5-year returns, and then move 
back toward 0.0 for longer return horizons. This pattern is consistent 
with the hypothesis that stock prices have a slowly decaying stationary 
component. The negative autocorrelation of returns generated by a 
slowly decaying component of prices is weak at the short return
horizons common in empirical work, but it becomes stronger as the return
horizon increases. Eventually, however, random-walk price
components begin to dominate the variation of returns, and long-horizon
autocorrelations move back toward 0.0. 

Autocorrelation may reflect market inefficiency or time-varying 
equilibrium expected returns generated by rational investor behavior. 
Neither view suggests, however, that patterns of autocorrelation 
should be stable for a sample period as long as 60 years. Although a 
tendency toward negative autocorrelation of long-horizon returns is 
always observed, subperiod results suggest that the strong negative 
autocorrelation of the 1926-85 period may be largely due to the first 
15 years. Autocorrelations for periods after 1940 are closer to 0.0, 
and they do not show the U-shaped pattern of the overall period. 
Because sample sizes for long-horizon returns are small, however, 
sample autocorrelations cannot identify changes in the time-series 
properties of returns. Stationary price components may be less
important after 1940, or perhaps prices no longer have such temporary
components. Resolution of this issue will require more powerful
statistical techniques.


Appendix 

Simulations 

The simulations mimic properties of NYSE returns. Monthly simulated
returns are summed to get overlapping monthly observations on T-year
returns, r(t, t + T). We estimate regressions of simulated returns r(t, t + T) on
lagged returns r(t - T, t) to obtain sampling distributions of the slopes. The
simulations use 720 observations per replication, matching the number of 
months in the 1926-85 sample period. 

We simulate two models in which the true slopes are 0.0. One is a random 
walk in the log price with normal (0, 1) monthly returns. The second is a 
random walk in which return variances change every 2 years to approximate 
changes in stock return variances (see table A1). We also simulate constant
and changing variance versions of the model (1)-(3) in which the log price
has both a random-walk and an AR1 component (see table A2). 

A. The Random-Walk Simulations 

Table A1 summarizes estimates of regression slopes for the random-walk 
models. The negative bias of OLS slopes is apparent from the average slopes 
in the first line of the table. The bias increases with the return horizon
because effective sample sizes are smaller for longer horizons and because the
increased overlap of the observations increases serial dependence. 

The second line of the table shows average bias-adjusted slopes. The bias 
correction is the average slope for each return horizon from 10,000
preliminary replications of the constant-variance random-walk model. These bias
corrections are used whenever we refer to bias-adjusted slopes for the 1926-85
period for NYSE returns or for the simulations. Since the average bias-adjusted
slopes in table A1 are close to 0.0, the preliminary simulations give
good estimates of bias when monthly returns are white noise. The bias
corrections for the 1941-85 period (540 months) used in the text are also average
slopes from 10,000 preliminary replications of the random-walk model, but 
with 540 rather than 720 observations per replication. 
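
The bias correction can be sketched as a small Monte Carlo experiment. The code below is an illustrative reconstruction, not the authors' program: function names are hypothetical, monthly returns are drawn as normal (0, 1) white noise as in the constant-variance random-walk model, and only 200 replications are used (instead of 10,000) so that the example runs quickly.

```python
import random


def horizon_returns(monthly, horizon):
    """Overlapping horizon returns built from cumulative sums of monthly returns."""
    cumulative = [0.0]
    for r in monthly:
        cumulative.append(cumulative[-1] + r)
    return [cumulative[i + horizon] - cumulative[i]
            for i in range(len(monthly) - horizon + 1)]


def ols_slope(y, x):
    """Slope from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def autoregression_slope(monthly, horizon):
    """Slope in the regression of r(t, t + T) on r(t - T, t)."""
    r = horizon_returns(monthly, horizon)
    return ols_slope(r[horizon:], r[:-horizon])


def simulated_bias(n_months, horizon, replications, seed=0):
    """Average OLS slope when monthly returns are white noise (true slope 0.0)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        monthly = [rng.gauss(0.0, 1.0) for _ in range(n_months)]
        total += autoregression_slope(monthly, horizon)
    return total / replications


# 5-year horizon (60 months) in a 720-month sample.
bias_5yr = simulated_bias(n_months=720, horizon=60, replications=200)
print(round(bias_5yr, 3))
```

A bias-adjusted slope is then the OLS slope estimated from the data minus the simulated average for the same horizon and sample length. Passing `n_months=540` gives the analogous correction for a 1941-85 sample.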

The standard deviation of the sample of slopes for each return horizon in 
table A1 estimates the standard error of the slope. The standard deviations
are large, for example, 0.24 for 5-year returns. Since 720 months yield 12
nonoverlapping 5-year returns, the slope standard error for nonoverlapping
returns would be (1/11)^0.5 = 0.30. The standard error 0.24 for 600 monthly
observations on 5-year returns implies an effective sample size of 18.4
nonoverlapping returns.
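
The effective-sample-size arithmetic can be reproduced directly. The sketch assumes, as the numbers in the text suggest, that the slope standard error with n nonoverlapping returns is approximately (1/(n - 1))^0.5, so that a given standard error implies n = 1/se² + 1. The function names are hypothetical.

```python
import math


def se_nonoverlapping(n_returns):
    """Approximate slope standard error with n nonoverlapping returns."""
    return math.sqrt(1.0 / (n_returns - 1))


def effective_sample_size(standard_error):
    """Number of nonoverlapping returns implied by a slope standard error."""
    return 1.0 / standard_error ** 2 + 1.0


print(round(se_nonoverlapping(12), 2))         # 12 nonoverlapping 5-year returns
print(round(effective_sample_size(0.24), 1))   # standard error from overlapping data
```

The comparison shows how much information the overlapping monthly observations add relative to using 12 nonoverlapping 5-year returns.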

The t’s in table A1 for tests of bias-adjusted slopes against 0.0 use standard
errors adjusted for residual autocorrelation due to return overlap (see
Hansen and Hodrick 1980). Lower fractiles of the t’s are estimates of critical
values for tests of the hypothesis that the slope is 0.0 (prices are random
walks) against the alternative that the slope is negative because the price has a
temporary component. The t’s for the changing-variance random walk are
most relevant, given the changing variances of stock returns. For 3-5-year
returns, the .10, .05, and .01 fractiles of the t’s are around -1.8, -2.3, and
-3.5. These are more extreme than the same fractiles of the unit normal,
-1.28, -1.65, and -2.33. Standard deviations around 1.3 also show that the
simulation t’s have more dispersion than the unit normal.

Comparison with part A of table A1 shows that the higher dispersion of the 
t’s in part B is due to changing variances. We have estimated slope standard 
errors using the method of White (1980) and Hansen (1982) to jointly correct 
for autocorrelation and heteroscedasticity. Resulting t’s show more dispersion 
and more extreme negative lower fractiles than t’s based on Hansen and 
Hodrick’s (1980) standard errors. In the NYSE returns, we use Hansen and 
Hodrick’s standard errors. 


B. The Mixed AR1-Random-Walk Simulations

Table A2 summarizes simulations of the mixed AR1-random-walk model.
True slopes that drop from -0.10 for 1-year returns to -0.27 for 5-year
returns are similar to the slopes estimated for the value-weighted NYSE
market portfolio in table 1. We view the simulations as evidence about the power
of the tests to reject the random-walk model when prices have stationary 
components that imply slopes in the 3-5-year range like those observed for 
NYSE returns. 

Under the random-walk hypothesis, t’s for tests of bias-adjusted slopes
against 0.0 are relevant. Average t’s in part B of table A2 are only around
-1.18 for 3-5-year returns. Likewise, the fractiles of the slopes for the mixed
AR1-random-walk model in part B of table A2 are somewhat to the left of
those for the random-walk model in part B of table A1, but the overlap of the 
distributions is substantial. In short, large standard errors for the slopes (the 
standard deviation of the 5-year slopes in pt. B of table A2 is 0.18) mean that 
the regression tests have little power to reject the random-walk model when 
prices have a stationary component that accounts for 27 percent of the
variance of 5-year returns. Stationary components of stock prices must generate
large negative slopes to be identified reliably in our tests. 

Finally, consistent with Kendall (1954) and Marriott and Pope (1954), table 
A2 shows that simple OLS slopes are less biased when the true slopes are 
negative, and bias corrections relevant when the true slopes are 0.0 produce 
estimates biased toward 0.0. For example, the true 5-year slope in part B of 
table A2 is -0.27, the average of simple OLS slopes is -0.30, and the average 
bias-adjusted slope is -0.17. Thus simple OLS slopes are closer to unbiased 
when the log price has an important AR1 component. 

Mathematical Vindication of Ricardo 
on Machinery 


Paul A. Samuelson 

Massachusetts Institute of Technology 


Ricardo is shown to be right that machinery can hurt wages and 
reduce output. A dramatic robot example reveals Wicksell’s error in 
believing that Pareto optimality calls for no drop in total output from 
a viable invention. Under Ricardo's axiom that labor supply adjusts 
to keep wages at the subsistence level, he can correctly deduce on a 
market-clearing basis a rise in his net product (rent plus interest), 
while the greater drop in population and total wages results in a 
reduction in his gross product (rent plus interest plus wages). 


The chapter on machinery that Ricardo added to the third edition of 
his Principles (1821) has generally been suspect among his contemporary 
economists and his posterity. I regard this suspicion as 
unjustified and consider it the best single chapter in this overpraised 
work. 

Presented here is a simple classical scenario in which the invention 
of a robot machine does, as he said was possible, reduce the demand 
for labor permanently, reduce the total of wages, reduce what 
Ricardo defines to be the gross product, and cause the population to 
decline. Moreover, the scenario takes place along the lines of his 
arithmetic example. The behavior equations underlying my model 
are precisely those that have been used by Pasinetti (1960) and 
Samuelson (1959a, 1959b, 1978) to model the Ricardian system: rent 
as a residual and not explicitly as a Clarkian marginal product, wage-fund 
elements, market-clearing full employment, a single-good (corn) 
model, or a many-good model. 

Since the literature has been in doubt on the logical possibility of 
what Ricardo contended, it is appropriate that I should fabricate an 
overdramatic example of the starkest type that provides an instance 
vindicating Ricardo's reasoning. I have done this by using robots that 
decimate human labor in a corn-only world. The reader will realize 
that, once the Ricardian contention is demonstrated to be logically 
feasible, with trifling ingenuity I can manufacture ad lib examples of 
it in elaborated multicommodity cases that simulate realistic scenarios 
in economic history. 

Assumptions 

The labor supply, $L_t$, remains constant when the real corn wage is at 
the subsistence level of $\bar{w}$; by dimensional license this can be a corn 
wage per worker of unity. When the actual wage is above the subsistence 
rate, population grows; when it is below, population declines: 

$$L_{t+1} - L_t = a(w_t - 1), \tag{1}$$

where $a$ may be a positive constant or any function of $(w_t, L_t)$ that is 
positive. 

Initially corn is produced by labor working on fixed available 
acreages of land, assumed to tail off in quality continuously so that 
there are always observable external margin zero-rent acreages just 
worth cultivating. The competitive wage rate for all workers, w,, 
equals the corn product produced on the external margin by a worker 
there but discounted at the market rate of interest r, because workers 
get paid at the beginning of the production periods while corn output 
becomes available at the end. 

Mathematically, as is well known, 

$$Q_{t+1} = f(L_t), \qquad f' > 0 > f'', \tag{2}$$

$$\text{Wage rate} = w_t = \frac{f'(L_t)}{1 + r_t}, \tag{3}$$

$$\text{Rent} = f(L_t) - w_t(1 + r_t)L_t, \tag{4}$$

where $Q$ is corn output; $L$ is labor input; $w$ is the real wage rate, $w = 1$ 
being its subsistence level; $r$ is the interest rate; $f$ is the production 
function giving total $Q$ for each level of $L$ spread competitively among 
the diverse acres; and the derivative $f'(L)$ is what a worker produces 
of corn at the no-rent margin. 
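
The interplay of the population rule (1) and the wage equation (3) can be illustrated with a minimal numerical sketch. Everything specific in it is an assumption made for illustration and is not taken from the text: the linear marginal-product schedule f'(L) = 3.2 - 0.02L (chosen so that f'(100) = 1.2), the constant adjustment speed a = 10, and an interest rate held fixed at 20 percent along the path.

```python
# Hypothetical marginal-product schedule f'(L) = A - B*L (an assumption,
# not from the paper), calibrated so that f'(100) = 1.2.
A, B = 3.2, 0.02


def marginal_product(labor):
    """Corn produced by one more worker at the no-rent margin."""
    return A - B * labor


def wage(labor, r=0.20):
    """Equation (3): the marginal product discounted one period at rate r."""
    return marginal_product(labor) / (1.0 + r)


def simulate_population(labor0, periods, a=10.0, r=0.20):
    """Iterate equation (1): L[t+1] - L[t] = a * (w[t] - 1)."""
    path = [labor0]
    for _ in range(periods):
        labor = path[-1]
        path.append(labor + a * (wage(labor, r) - 1.0))
    return path


if __name__ == "__main__":
    path = simulate_population(labor0=80.0, periods=60)
    print(round(path[1], 2), round(path[-1], 2))
```

Starting below the stationary population, the wage exceeds subsistence and population grows toward the level at which the discounted marginal product equals unity; starting above it, the reverse happens. With a different (still positive) choice of a only the speed of adjustment changes, not the resting point.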

In initial long-run equilibrium, when $t$ subscripts are ignored because 
all variables are stationary, for each long-run $\bar{r}$, we have 

$$\begin{aligned}
f'(L) &= 1 + \bar{r}, \quad L = 100 \text{ (say)},\\
Q &= f(L) = f(100) = 220 \text{ (say)},\\
\text{Wages} &= wL = 100,\\
\text{Rent} &= f(L) - w(1 + \bar{r})L = 220 - (1 + \bar{r})100\\
&= 220 - 120 \text{ (say)} = 100,\\
\bar{r} &= .20 \text{ (a 20 percent interest rate per period)},\\
\text{Interest} &= \bar{r}(wL) = .2(100) = 20.
\end{aligned} \tag{5}$$

Of the 220 of gross corn output, Ricardo counts the 100 + 20 consumed 
by property owners (landowners’ 100 and capitalists’ 20) as net 
output. He treats the 100 of subsistence corn for humans as fodder, a 
necessary cost of producing the net product. Net output for him 
equals Rent plus Interest and falls short of gross output, which includes 
Wages; Kuznets-Haig national income is Ricardo's gross product. 

It would not matter if I introduced an invention of robot machines 
into a growing Ricardian system rather than into an initial long-run 
equilibrium. But it is useful to show that Ricardo’s argument does not 
need to depend at all on short-run frictions or transitional technological 
unemployments. (Also, the initial interest rate could as well be 
zero as 20 percent if property owners continued to save positively at 
any secure positive rate of interest. The reader may replace r = 0.2 by 
r = 0.0 in the fashion of Schumpeter and Kalecki-Marx.) 

Now let there be invention of a robot machine that lasts one period 
and can do exactly the work of one man. Let one new machine be 
producible by exactly the labor-land resources that produce $1 - \epsilon$ 
units of corn. This instance in which a machine is definitely cheaper 
than a man provides the starkest case for Ricardo's logic. 

After the invention, we rewrite (2) to take account of the number of 
machines, writing for gross product (in terms of the corn numeraire) 

$$Q_{t+1} + (1 - \epsilon)K_{t+1} = f(L_t + K_t). \tag{2'}$$

How (3) and (4) must be rewritten will soon be seen. 

At the old pricing equilibrium, even without any new desire to save, 
after the invention it will pay to divert some of the resources being 
used to produce corn to the task of producing some new robots—a 
switch from the wage fund to fixed capital, just as Ricardo’s example 
called for. If $1 - \epsilon$ is little less than unity, the new profit rate will be 
little different from that of the previous equilibrium; but ever so little 
a difference will motivate the described shift of some resources away 
from corn production. (So we see that those writers who tried to 
criticize Ricardo’s arithmetic example on the ground that it presumed 
no realistic rise in the profit rate from the innovation were beside the 
mark: the rise in r can be large or little.) 

As soon as the diversion of resources away from corn reduces its 
total output, on the reasonable assumption that property owners do 
not massively abstain from corn consumption to finance the robot 
capital formation, there will be less left over in the wage fund than 
before: now fewer than 100 corn bids for the existing population of 
100. So the market-clearing real wage falls at least temporarily below 
the subsistence level. Ricardo’s equation (1) dictates a falling off of 
population—as he says, a redundant population. 

It is well that the wage falls since otherwise the higher interest rate 
occasioned by the invention could not be earned on the corn advanced 
to an unchanged number of workers. Marx properly criticized 
Ricardo for the calm way he faced the Malthusian destruction of 
people and the abortion of natural fertility. But it is Ricardo’s story, 
and we must let him tell it his own way. As the stock of robots gets 
built up, the wage fund shrinks and necessarily shrinks fast enough to 
match the declining population and to keep the wage rate below the 
population maintenance level. 

The land will come to be cultivated even more intensively than 
before, but more and more of it is used to replace robots and to 
provide increments to the stock of robots. And more and more of the 
acres are being tilled by robots rather than by people. 


Long-Run Equilibrium 

Before I sketch the dynamic path of transition, let me describe the 
terminal equilibrium after all adjustments to the robot invention have 
been made. 

For simplicity, I follow Ricardo’s assumption that there is always 
some long-run effective interest rate at which it just pays to keep 
capital intact and save zero net. Let this r be as before: 20 percent in 
my example, perhaps 0 percent in the reader’s variant of it. 

Theorem. In the new equilibrium, all labor will have been rendered 
redundant. Human labor will have died out. The stock of robots will 
necessarily be so large as to extend the margin of cultivation to worse 
lands than in the original equilibrium; Ricardo’s net product, his Rent 
plus Interest that property owners spend in the new steady state on 
their corn consumption, will be higher after the invention than ex 
ante as long as $\bar{r}$ is sufficiently small. The Kuznets national product, 
Wages + Interest + Rent, will (as Ricardo claimed and as Wicksell 
denied to be possible) have fallen in any situation in which the labor 
saving is not too extreme (in the sense that $1 - \epsilon$ is not too 
minuscule). 

When we recognize products other than corn, Ricardo recognizes 
that less of the labor supply will have to be reduced, especially when 
property owners spend their enhanced income on personal service 
(retainers etc.); but still, for $\epsilon$ not too large, his gross product does fall. 

The conditions of postinvention equilibrium for the stock of robot 
machines $K$, the labor supply $L$, and the output of corn $Q$ are 

$$Q + (1 - \epsilon)K = f(K + 0), \tag{6a}$$

$$f'(K + 0) = (1 + \bar{r})(1 - \epsilon) < 1 + \bar{r}, \tag{6b}$$

$$w = \frac{f'(K + 0)}{1 + \bar{r}} = 1 - \epsilon < 1. \tag{6c}$$

Because the wage that a worker would earn on the external margin, 
which is the same as the “net rental” an equivalent robot earns there, 
is seen in (6c) to be always below the subsistence wage, no farm workers 
at all can survive! Therefore, 


$$L = 0, \tag{6d}$$

$$f'(K) < f'(100), \quad \text{from (6b)}, \tag{6e}$$

$$K > 100 = \text{previous } L, \tag{6f}$$

$$\text{Rent in corn} = f(K) - Kf'(K) > f(100) - 100f'(100) = \text{previous rent}. \tag{6g}$$


A more triumphant vindication of Ricardo could hardly be possible, 
and this within his own concepts and in the longest-run equilibrium. 1 
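
The two steady states can be compared numerically in a minimal sketch. The quadratic technology below is a hypothetical choice, not the paper's: it is calibrated only so that f(100) = 220 and f'(100) = 1.2, matching the "(say)" numbers of (5). The value of epsilon, the function names, and the treatment of interest as the long-run rate times the corn-equivalent resources tied up in robots are likewise illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical quadratic technology (an assumption, not the paper's):
# f(x) = A*x - 0.5*B*x**2, calibrated so that f(100) = 220 and f'(100) = 1.2.
A, B = 3.2, 0.02


def f(x):
    """Total corn-equivalent output from x workers-plus-robots on the land."""
    return A * x - 0.5 * B * x * x


def f_prime(x):
    """Output of one more worker (or robot) at the no-rent margin."""
    return A - B * x


@dataclass
class SteadyState:
    labor: float
    robots: float
    wages: float
    interest: float
    rent: float

    @property
    def net_product(self):
        # Ricardo's net product: Rent plus Interest.
        return self.rent + self.interest

    @property
    def gross_product(self):
        # Ricardo's gross product: Rent plus Interest plus Wages.
        return self.net_product + self.wages


def pre_invention(r_bar=0.20):
    labor = (A - (1.0 + r_bar)) / B          # solves f'(L) = 1 + r_bar, as in (5)
    wages = 1.0 * labor                      # subsistence wage of unity
    interest = r_bar * wages                 # interest on the advanced wage fund
    rent = f(labor) - (1.0 + r_bar) * wages  # residual, as in (4)
    return SteadyState(labor, 0.0, wages, interest, rent)


def post_invention(eps, r_bar=0.20):
    robots = (A - (1.0 + r_bar) * (1.0 - eps)) / B  # solves (6b) with L = 0
    interest = r_bar * (1.0 - eps) * robots         # interest on robot capital
    rent = f(robots) - robots * f_prime(robots)     # residual, as in (6g)
    return SteadyState(0.0, robots, 0.0, interest, rent)


if __name__ == "__main__":
    before, after = pre_invention(), post_invention(eps=0.05)
    print("robots:", round(after.robots, 2))
    print("net product:", round(before.net_product, 2), "->", round(after.net_product, 2))
    print("gross product:", round(before.gross_product, 2), "->", round(after.gross_product, 2))
```

Under these assumed numbers, with epsilon = 0.05 the robot stock settles at 103, the net product rises from 120 to roughly 125.7, and the gross product falls from 220 to the same 125.7 because wages have vanished. Other concave technologies would give different magnitudes; the sketch only instantiates the inequalities (6d) through (6g).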


Graphical Vindication 

Figure 1 is the standard Ricardo diagram of Kaldor (1956), Samuelson 
(1978), and many others. Though it resembles a Clark marginal 
product diagram, it needs only the classical concept of external margin 
land. The new element here is that robots are zero before the 
invention, and it is deduced that labor must be zero after the system 
adjusts to the new robot equilibrium. If robots had not been such 
perfect human substitutes, the reduction of population would of 
course have been less extreme. The graphs are seen to corroborate 
perfectly the mathematics of the previous section. 

1 One must never go overboard in praising the uneven Ricardo. As Stephen Leacock 
would say, Ricardo was away from school the day they taught the difference between 
necessary and sufficient conditions. It is not required that gross product be reduced by 
an invention if the demand for labor is to be hurt by that invention, but his way of 
seeing this last possibility was by way of the former possibility. For large $\epsilon$ the rent alone 
can exceed the preinvention gross product, and yet this case of increased gross product 
is most quickly harmful to labor! Also, for $\bar{r} > 0$, Ricardo erred if he thought that every 
viable invention must increase his net product. 

[Figure 1. Panels: Before Machines; After Robots.] 

Fig. 1. Note that Rent' > Rent because G' > G, Interest' > Interest, and land is 
more intensively cultivated in the postinvention equilibrium. Note that population has 
been made extinct then. New gross output DE'F's will be below old DEGO whenever E' 
is near to E, in vindication of Ricardo vis-à-vis Wicksell (OS' is $1 - \epsilon$ times the length 
of OS; Os' is $1 - \epsilon$ times Os). 

Ricardo’s fervor for laissez-faire made him balk at endorsing restrictions 
on technical innovations to help labor, as his logic required. 
One straw he grasped for was the possibility that the enhanced total of 
property incomes might get spent in part on more menial servants, so 
the new total L might not be so bad. The diagrams can handle this. 
Suppose one-tenth of the property income area, DEGO and D'E'G'O 
as the case may be, gets spent on labor servants; then form out of that 
area a rectangle with the Os height of unity and add it to these diagrams’ 
respective gross output areas to get correct total gross output. 

Similarly, Ricardo argued that more might get saved after property 
incomes are swelled by invention. Figure 1b already shows such an 
effect, but it could be enhanced if we made the new $\bar{r}$ lower than 
before, with the logical possibility that more servants would survive in 
the robot epoch than had previously found a living tilling corn. Of 
course, this ultimate induced rise in the demand for labor might be 
offset in the intermediate run by the stipulated adverse substitution 
effects of robots for farm labor—as Ricardo hinted. 

Needless to say, the doctrine is wrong which claims that all inventions 
that shift resources from circulating capital to fixed capital (to 
durable machines at the expense of “wage funds”) must reduce the 
demand for labor. New diagrams can depict widened wage rectangles 
in Ricardian diagrams such as these that portray inventions of durable 
machines that are less robot-like. 

Had Ricardo lived long enough to prepare a fourth edition, perhaps 
he would have changed his mind to eliminate his evident error 
in believing that more saving must be favorable to the demand for 
labor. Rapid saving will, in the polar robot model, speed up the 
euthanasia and genocide of human labor and accelerate the rise in 
land rent. While such saving raises Ricardo’s net product, it can decimate 
his gross product, whatever Wicksell’s confusion. 


Transitions 

A determinate dynamic path of reaction to the robot invention might 
be worked out by any reader using the canonical classical model of 
Samuelson (1978). 2 This could be done in a market-clearing context: 
as long as people still exist, they accept the lowered real wage rates 
that the auction market metes out. Alternatively, it would be realistic 
in 1820 to assume that people lose jobs (at least temporarily). The 
usual charge against Ricardo criticizes him for chronically neglecting 
short-run frictions that are realistic. So it is a twist that critics, who 
found unpalatable his chapter 31 conclusion that machines can hurt 
workers and wages, commonly had to discount his new chapter with 
the accusation that he needs to depend on technological unemployment 
and other fleeting frictions for his pessimistic results. 

Where is there warrant for this in his chapter 31 text? At most one 


2 Equation (3) applies after the invention as long as the transition involves positive $L$. 
Equation (4) also applies at all times. After the invention that brings robots into existence, 
we have the following relations, which are the dynamic versions of (6b) and (6c): 

$$1 + r_t = \frac{f'(K_t + L_t)}{1 - \epsilon}, \tag{6b'}$$

$$w_t = 1 - \epsilon \tag{6c'}$$

for $L_t > 0$, from (6b') and (3). To eqq. (1), (5), (6b'), and (6c'), we need to add a dynamic 
supply of saving by property owners to have a determinate path to Ricardo’s long-term 
steady state. Note that labor is indeed hurt by the invention even when we and Ricardo 
obey Say’s law and rule out nonclearing of any market. 
of its sentences even mentions people losing jobs (and later getting 
new ones). Once I steeped myself in the odd classical subsistence wage 
supply-of-labor mentality, I made perfectly good sense out of the 
sentences in this chapter while never departing from market-clearing 
methodology that would satisfy a devotee of the Lucas school of rational 
expectations. 

Discussion 

A few remarks are in order. Ricardo’s laissez-faire fervor made him 
think that a continuous stream of inventions has to be more favorable 
to labor than a single discontinuous one. There is little warrant for 
this, as the present mode of analysis can demonstrate. It is more 
forgivable that Ricardo in 1821 should have erred in this regard than 
that the host of commentators since then have fallen into this shallow 
trap. 

Chapter 31 is often criticized for what I regard as its excellences. In 
it Ricardo admits that he cannot free his theory of distribution of 
income from dependence on how consumers choose to spend their 
incomes. (Example: Peace makes population redundant as the labor- 
intensive demands of the military are replaced by normal demands. 
This sensible result led John Stuart Mill to his fuss concerning the 
demand for commodities not being the demand for labor—a result 
not so much wrong as overdramatized by Mill.) 

In chapter 31 Ricardo makes it clear that he does not assume fixed 
proportions between labor and capital good(s). In this chapter he 
anticipates the induced factor-biased inventing that we associate with 
Marx, Hicks, Fellner, Charles Kennedy, von Weizsacker and Samuelson, 
Drandakis and Phelps, and many others: if accumulation raises 
the real wage and lowers the interest rate, the speed of the trend, 
Ricardo perceives, can be lowered by the encouragement it gives to 
the invention of labor-saving, capital-using techniques. In chapter 31 
Ricardo discovers what he has elsewhere gratuitously denied: that an 
improvement abroad can hurt Britain under free trade (or, as needs 
to be said today, that an improvement in Japan can hurt the American 
living standard). 

I for one find chapter 31 a refreshing change from the sterilities 
and nonoptimalities of Ricardo's opening chapters and hope to have 
presented some evidence to support this unfashionable opinion. 3 

3 Elsewhere (in Samuelson, in press) I focus on the erroneous conclusions concerning 
Ricardo on machinery by Wicksell ([1923] 1981, 1934), who led many an excellent 
economist down the garden path. Logic aside, Wicksell seems to me more plausible in 
believing that, on the whole, inventions happen to raise real wages than Ricardo is in 
believing that they are likely to be hurtful (or than Hicks [1969] is in hypothesizing a 
harmful bias in the first decades of the nineteenth century). 

Plan and Market in China's Agricultural 
Commerce 


Terry Sicular 

Harvard University 


This article examines interactions between markets and state commercial 
planning in the context of China's agricultural sector. It 
begins with a discussion of recent trends in agricultural planning 
and commerce in China and then presents a theoretical model that 
analyzes the way that a mixed commercial system of the sort observed 
in China functions. The theoretical analysis suggests that a 
mixed system is sustainable and can have desirable efficiency and 
distributional effects. Markets, however, limit the range of sustainable 
plans, and in the presence of markets, state planning may no 
longer directly influence production and consumption behavior. 


The change in leadership following the death of Mao Zedong 
brought with it a shift in accepted views about the role of markets in a 
socialist system. For much of the previous three decades, Chinese 
leaders believed that markets should play a minimal role. Instead, 
output should revert to the state, which would then arrange for its 
distribution. Domestic commercial policies reflected this view. A large 
proportion of trade was carried out through state channels in accordance 
with economic plans. Quotas governed the procurement of 
agricultural and industrial goods, and rationing or direct allocation 
governed their distribution. Procurement and distribution prices 
were set by the state. 

Since 1978 Chinese leaders have acknowledged that the government 
cannot effectively plan for all allocation. Although planning 
is still considered essential, they believe that markets can allocate 
numerous commodities among numerous economic agents more 
efficiently than state plans. Commercial policies have shifted accordingly, 
encouraging markets to develop alongside state-planned commerce. 
For example, after 1978 the government gradually loosened 
restrictions on private trade for agricultural products, permitting 
producers to engage in private trade provided they fulfilled their 
delivery quotas. In response to these policy initiatives, market trade 
has expanded rapidly, and China has evolved into a “mixed” economic 
system, that is, a system in which planned and market exchange 
coexist. Chinese economists believe that the mixture of market 
and planning is appropriate to China’s present stage of 
development, and they take pride in what they consider a uniquely 
Chinese approach to socialist economic development. 1 

Despite certain successes, China’s shift toward a mixed system has 
not been smooth. Market prices have been unstable, state price subsidies 
have ballooned, and plan fulfillment for some commodities has 
become difficult to enforce, while for others government stockpiles 
have grown too large. Certain distributional inequities have also 
emerged. These problems have arisen in part because the Chinese did 
not fully foresee the consequences of introducing market allocation 
alongside the existing planning apparatus and in part because they 
are not sure exactly how the two should be best combined. 

Such problems have led some observers to believe that a mixed 
system is inherently unstable and that sustaining state planning in the 
presence of markets is difficult, if not impossible. The object of this 
paper is to show that plan and market are not inherently at odds. 
Under certain conditions a mixed system of the sort observed in 
China’s agricultural commerce is sustainable, and, furthermore, good 
reasons can exist for maintaining such a mixed system. 

In order to implement a sustainable and desirable mixture, however, 
the government must take into account interactions between the 
state plan and markets. Recent trends have demonstrated that the 
emergence of markets can affect the way in which state planning 
functions. Markets place pressures on plans, and they influence the 
economic effects of planning. Similarly, state planning can influence 

1 Such opinions have been expressed in articles by Chinese economists and policymakers 
(see, e.g., Liu 1986; Dai 1986; Xue et al. 1986). 

price and quantity trends in markets. In designing a plan or in analyzing 
the economic effects of a plan, such interactions should be considered. 

Past work on China’s economy has not, in general, examined interactions 
between planning and markets. Most of the literature on 
China’s economy concentrates on state planning and either ignores 
private markets or treats them only superficially. The reasons for this 
are fairly obvious: until recently little information about private trade 
was available, and for many years the level of private trade was very 
low. Thus the lopsided emphasis on planning was, for the most part, 
unavoidable. 

Exceptions include Perkins’s classic work Market Control and Planning 
in Communist China (1966), which examines developments in the 
1950s and early 1960s as China moved away from a market system. 
This paper is concerned with issues similar to those he discussed but 
focuses on the recent reemergence of markets. A second notable 
exception is the recent research of Byrd (1987), who examines the 
mixed allocation system in Chinese industry. 

This paper examines the sorts of interactions that occur in a mixed 
system such as that observed in Chinese agriculture. The first section 
discusses recent reforms in commercial policies and points out some 
of the problems that have emerged in the mixed commercial environment. 
The second section presents a simple, stylized, analytical model 
of the sort of mixed system that existed in Chinese agriculture between 
1979 and 1985. This model clarifies key ways in which plan and 
market interact in such economies. A concluding section discusses 
implications of the analysis for plan design. 

The central conclusions of this analysis are as follows: 

1. Markets place pressure on the state to choose planned prices so 
that they do not deviate too far from market prices. If the state tries to 
set a planned price that exceeds the market price, then the state will 
incur budgetary losses. If state prices are considerably lower than 
market prices, plan evasion can become a problem. In either case, 
plan sustainability may be sacrificed. 

2. In the presence of markets and if quotas are enforceable, state- 
planned prices and quotas cause lump-sum transfers among producers 
and consumers and among sectors. They also cause lump-sum 
transfers between the government and nongovernment sectors of the 
economy. Commercial planning can thus serve as an efficient redistributive 
and tax mechanism. 

3. In the presence of markets, state-planned prices and quotas will 
not, except in certain cases, directly affect levels of production and 
consumption. Except in cases in which the state price or quota is “too 
high” (to be defined below), producers and consumers will look at 
market prices, not state prices and quotas, in allocating their resources. 

4. Although state prices and quotas do not affect production and 
consumption directly, they nevertheless influence them indirectly 
through their effects on the distribution of income and equilibrium 
market prices. 

These conclusions point out some of the ways in which markets 
affect planning and planning affects markets. When choosing levels 
of planned prices and quotas, the government will want to take such 
interactions into account. 
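
Conclusions 2 and 3 can be illustrated with a minimal sketch of a single producer. The setup is a stylized assumption for illustration only and is not the model developed later in the paper: a quadratic cost function with a hypothetical parameter k, an enforceable quota that lies below the producer's desired output, and all names and numbers chosen freely.

```python
def profit(output, p_market, p_plan, quota, k=0.01):
    """Quota is delivered at the plan price, the rest is sold on the market.

    The quadratic cost 0.5 * k * output**2 is a hypothetical choice.
    """
    revenue = p_plan * quota + p_market * (output - quota)
    return revenue - 0.5 * k * output * output


def best_output(p_market, p_plan, quota, k=0.01, grid_max=500):
    """Grid search over integer output levels that at least cover the quota."""
    candidates = range(quota, grid_max + 1)
    return max(candidates, key=lambda y: profit(y, p_market, p_plan, quota, k))


def transfer(p_market, p_plan, quota):
    """Income change relative to selling the quota amount at the market price."""
    return (p_plan - p_market) * quota


if __name__ == "__main__":
    for p_plan, quota in [(1.0, 50), (1.5, 100), (2.0, 0)]:
        y = best_output(2.0, p_plan, quota)
        print(p_plan, quota, y, transfer(2.0, p_plan, quota))
```

In this sketch the chosen output depends only on the market price, because the marginal unit is always sold on the market; changing the plan price or the quota shifts income by a fixed amount without moving the output decision. The sketch deliberately leaves out the cases the text flags as exceptions, in which the state price or quota is "too high."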


Recent Reforms in Commercial Policies 

Reform of policies governing trade in agricultural products began in 
1977 and has passed through two stages. First, between 1977 and 
1982, the state maintained the existing design of state commercial 
planning for major agricultural products but adjusted state-planned 
quotas and prices, permitted markets to revive, and introduced state 
trade at negotiated prices. In the second stage the design of commercial 
planning was modified. The story of these commercial reforms 
illustrates how the original planning structure became increasingly 
problematic as markets emerged, prompting a second round of reforms. 

Prior to and during the first stage of commercial reforms, the state 
set mandatory minimum delivery quotas. For grain and oilseeds, 
these quotas specified minimum absolute quantities to be delivered to 
the state at planned prices. Quota levels were reduced between 1978 
and 1982. Over these 4 years the national grain quota and tax were 
reduced 20 percent (Ministry of Commerce 1984, pp. 386-87), with 
reductions to some extent targeted regionally to benefit low-income 
and disadvantaged areas. 

As before, no maximum limit was set on deliveries to the state: the 
state promised to buy as much as producers wished to sell at state 
prices. The state also continued to distribute these products in a 
planned way, primarily to urban consumers and industrial users. 
Rural areas for the most part did not receive allocations of agricultural 
products, and those allocations that did occur went primarily to 
areas well suited for production of commercial crops. 

The state paid a fixed price for quota deliveries and a higher, bonus 
price for deliveries beyond the quota. In 1979, state-planned prices 
for quota farm deliveries were raised by more than 20 percent, and 
the percentage price bonus for above-quota deliveries was enlarged from 
30 percent to 50 percent for grain and oil-bearing crops. A new 30 
percent above-quota price bonus was instituted for cotton. State retail 
prices for grain and edible vegetable oils remained at their original 
levels, while retail prices of meats, vegetables, and several other 
nonstaple foodstuffs were increased. 
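
The combined effect of the two 1979 changes on the above-quota price for grain and oil-bearing crops can be gauged with a rough worked calculation. It assumes, for illustration only, that the bonus is a percentage markup on the quota price and that the quota price rose by exactly 20 percent (the text says "more than 20 percent," so the result is a lower bound). Writing $P$ for the old quota price:

```latex
\[
\frac{\text{new above-quota price}}{\text{old above-quota price}}
  = \frac{(1.20\,P)(1 + 0.50)}{P\,(1 + 0.30)}
  = \frac{1.80}{1.30}
  \approx 1.38 .
\]
```

Under these assumptions the above-quota price rose by at least about 38 percent, a good deal more than the rise in the quota price itself.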

Policies on free markets were revised in two respects. Local and 
long-distance market exchange were allowed and even encouraged to 
develop. In addition, the number of products that could be exchanged 
in free markets was enlarged. By the early 1980s the state 
permitted market trade in all agricultural products but cotton, and 
this last restriction was later removed. Producers were, however, allowed 
to sell their products on the market only after they had met 
their delivery quotas. In response to these policies, market trade developed 
rapidly. Between 1977 and 1985, the number of markets 
more than doubled, rising from 30,000 to 61,000, and the volume 
of trade more than quadrupled. 2 By 1984 more than 18 percent 
of all purchases of agricultural products took place at market prices 
(table 1). 

Concurrently, the government took steps to make state commerce 
somewhat more responsive to market forces. The first step in this 
direction was to expand the role of “negotiated” state trade. The state 
began to offer negotiated purchase prices agreed on jointly by the 
producer and local state commercial agents for voluntary above-quota 
deliveries to the state. Negotiated purchase prices were to be decided 
on the basis of regional, yearly, seasonal, varietal, and quality considerations 
and to basically follow supply and demand trends; however, 
they were in general not to exceed local market prices (Wang 1985, 
p. 53). The state commercial organs could then sell goods purchased 
at negotiated prices at negotiated retail prices, which were to be set 
equal to the negotiated purchase price plus reasonable transport and 
handling fees. 

The revival of negotiated price procurement gave the state commercial 
system more flexibility in responding to market conditions 
and, together with negotiated price sales, was to provide a lever for 
influencing prices in the free market. Moreover, in areas where earnings 
from new sideline and nonagricultural work opportunities 
threatened to divert labor from agriculture, local officials could use 
the higher negotiated purchase prices to make agriculture competitive 
and maintain government procurement. 3 The importance of negotiated 
price procurement grew quite rapidly. For grain, negotiated 
price procurement rose from less than 3 percent of net state purchases 
in 1978 to roughly 17 percent in 1983 (State Statistical Bureau, 
Dept. of Commercial and Price Statistics 1984, pp. 156, 329). 

2 Real growth in the volume of market trade over this period was slightly less than 
what is indicated by these nominal figures since the level of market prices rose during 
this period. Available evidence on market price levels indicates that prices rose a total of 
percent between 1977 and 1984 (State Statistical Bureau, Dept. of Commercial and 
Price Statistics 1984, pp. 395-97; State Statistical Bureau 1985, p. 95). The increases in 
the market price were relatively modest because under the restricted market situation 
prevalent in 1977, free-market prices were already quite high. 

These policies contributed to rapid growth in levels of agricultural 
production and incomes, but by 1982 certain problems with the de¬ 
sign of commercial planning began to emerge. The higher prices and 
wider scope of bonuses and negotiated prices for above-quota de¬ 
liveries encouraged evasion of quotas. Producers preferred to sell at 
the above-quota, negotiated, and market prices, and they found ways 
to do so. One was to switch from planting crops with relatively high 
quotas to those with low or no quotas. Fields subject to grain quotas 
were converted to cotton or other economic crops, and vice versa. 
Another evasion tactic was to save output for 1 or 2 years and then 
deliver it all at once, or for several families to transfer their output to 
one family for delivery to the state. Finally, local officials, under pres¬ 
sure from their neighbors, occasionally succumbed to the temptation 
of carrying out unauthorized reductions in quota levels. Such sorts of 
behavior resulted in underfulfillment of quotas at the same time that 
deliveries at above-quota and negotiated prices increased (table 2) 
(Xu, Chen, and Liang 1982, pp. 121-24, 216-17; Guo and Gu 1983, 
p. 34; Xue 1985, p. 42). Although total deliveries of grain to the state 
grew at more than 10 percent a year, 4 quota fulfillment declined from 
94.6 percent in 1979 to 82.4 percent in 1980 to 80 percent in 1981. 5 
Farmers obviously benefited from these trends, but the state paid 
higher prices for farm products than it had anticipated. 

A second problem was that the procurement system was designed 
to operate in a shortage economy and was ill suited to handle emerg¬ 
ing agricultural surpluses. Under the existing quota system, the gov¬ 
ernment was obliged to buy as much output as farmers wished to sell. 
As surpluses grew, the government found itself committed to buying 
ever-increasing quantities of products at high, above-quota prices. 
This problem became especially severe in the early eighties when 
free-market prices for grain began to fall. The design of the procure¬ 
ment system thus contributed to growing state inventories of grain 


3 Huang (1983, pp. 36-37) discusses use of negotiated pricing for this purpose in 
Guangdong province. 

4 With the exception of 1980, when grain output fell 3 percent and deliveries to the 
state declined by less than 1 percent. 

5 Note that the degree of underfulfillment increased despite the fact that quota levels 
were being reduced. 
TABLE 2 

Various Components of State Grain Procurement 

                                        Shares (%) of Total Deliveries Procured at 
                                        ------------------------------------------ 
        Quota       Degree of Quota     Quota Price               Above-Quota or 
Year    and Tax*    Fulfillment (%)     (Including Tax Grain)     Negotiated Prices 
----    --------    ---------------     ---------------------     ----------------- 
1955    40.00       ...                 ...                       ... 
1965    36.25       ...                 ...                       ... 
1966    36.25       ...                 ...                       ... 
1967    36.25       ...                 ...                       ... 
1968    36.25       ...                 ...                       ... 
1969    36.25       ...                 ...                       ... 
1970    36.25       ...                 ...                       ... 
1971    38.25       ...                 ...                       ... 
1972    37.75       ...                 ...                       ... 
1973    37.75       ...                 ...                       ... 
1974    37.75       ...                 ...                       ... 
1975    37.75       ...                 ...                       ... 
1976    37.75       ...                 ...                       ... 
1977    37.75       ...                 ...                       ... 
1978    37.75       ...                 68.5                      31.5 
1979    35.00       94.6                49.5                      50.5 
1980    54.33       82.4                46.5                      53.5 
1981    30.38       80.0                40.0                      60.0 
1982    30.32       ...                 30.0                      70.0 
1983    ...         ...                 27.6                      72.4 

Source.—Wu (1982, p. 5); Xu et al. (1982, p. 217); Ministry of Commerce (1984, pp. S86-87); Walker (1984, p. 
82); Wang and Wang (1984, p. SO); Lardy (19SS, p. 31); Wang (1985, p. 52). 

* Quota and tax grain are in trade (husked) grain equivalents, measured in millions of metric tons. 


and further exacerbated state losses on the trade of agricultural prod¬ 
ucts. 6 

An additional related problem also arose in the new surplus envi¬ 
ronment. In the shortage economy that had existed for most of the 
previous two decades, matching supply to demand had never been a 
major concern because people would buy whatever was offered for 
sale. In the early eighties, with growth in incomes leading to increasing 
selectivity of demand and with the improved availability of foodstuffs, 
this was no longer true. The state began to find itself holding 
surplus stocks of undesirable commodities and was unable to meet 
consumer demand for a variety of higher-quality, nonstaple items. 
The procurement system’s inability to pass on demand signals to 
producers became increasingly evident with time. 

Finally, the procurement system created inequities. Quota levels 
varied among regions, and so the accrual of above-quota price awards 
was unequal. Regions with low quotas were able to sell more at 
above-quota and market prices and so received higher average prices, while 
those with higher quotas received lower average prices. For example, 
cotton farmers in the North, especially the Yellow River Basin, faced 
low cotton quotas and were receiving average prices of 200 yuan per 
ton or more from the state. In the southern provinces, such as Hubei, 
quotas were higher, and average state prices were less than 140 yuan 
(Guo and Gu 1983, p. 34). Planned commerce similarly contributed to 
inequality between urban and rural sectors since the urban population 
was the primary beneficiary of low-priced state sales of agricultural 
products. 

6 A second important factor contributing to losses in state agricultural trade was the 
inversion of state procurement and retail prices. 

These various problems prompted Chinese planners to carry out 
further commercial reforms. The second-stage reforms were distinct 
from those that had occurred earlier in that their aim was to modify 
the design of procurement planning and greatly reduce the scope of 
planned commerce for agricultural products. Redesign of the state 
procurement system began in 1983 when the government eliminated 
the price distinction between quota and above-quota deliveries for 
oilseeds and began to pay a single price for both quota and above¬ 
quota deliveries. Although the new price for oilseed varied somewhat 
by region and variety, in general it was a weighted average of 40 
percent of the old quota price plus 60 percent of the old above-quota 
price (Wang 1985, p. 52). Similar reforms occurred for cotton in 1984 
(Almanac of China's Economy 1984, pp. IV-50, IX-132; Jiage lilun yu 
shijian, no. 2 [1985], p. 47) and then for grain in 1985. For grain, the 
new price was set equal to 30 percent of the quota price plus 70 
percent of the above-quota price. 7 
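
The single-price rule described above is simple arithmetic, and a short Python sketch may make it concrete. The weights (40/60 for oilseeds, 30/70 for grain) are those reported in the text; the function name `blended_price` and the two old prices are hypothetical values chosen only for illustration and are not taken from the sources cited.

```python
def blended_price(quota_price, above_quota_price, quota_weight):
    """Single procurement price as a weighted average of the two old prices."""
    if not 0.0 <= quota_weight <= 1.0:
        raise ValueError("quota_weight must lie between 0 and 1")
    return quota_weight * quota_price + (1.0 - quota_weight) * above_quota_price


# Hypothetical old prices (yuan per ton), for illustration only.
old_quota_price = 100.0
old_above_quota_price = 150.0

# Weights reported in the text: 40/60 for oilseeds, 30/70 for grain.
oilseed_price = blended_price(old_quota_price, old_above_quota_price, 0.40)
grain_price = blended_price(old_quota_price, old_above_quota_price, 0.30)

# The new single price lies between the two old prices, so the price paid
# on the marginal (formerly above-quota) delivery is lower than before.
marginal_price_falls = grain_price < old_above_quota_price
```

Under these assumed numbers the blended price exceeds the old quota price but falls short of the old above-quota price, which is the sense in which the marginal price paid by the state is lowered.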

In conjunction with these pricing modifications, in early 1985 the 
government announced that, except for a few products, it would no 
longer send down procurement quotas to farmers. For grain and 
cotton, quotas were to be replaced by a program of contract and 
market purchases. State commercial departments were to negotiate 
purchase contracts with farmers before the sowing season and, when 
necessary, carry out supplemental procurement on the free market. 
The prices of these contracts were fixed at the new, weighted prices 
mentioned above. In theory, farmers could choose freely whether or 
not they wished to enter into contracts. Products not under contract 
could be sold on the market or to the state at a low, guaranteed price 
(equal to the old quota price) at harvesttime. The state no longer 
promised to buy as much as farmers wished to sell at the higher 
above-quota or contract prices. 

7 Jiage lilun yu shijian, no. 4 (1985), p. 51. This change to a single-price system not only 
eliminated some of the structural problems with the above-quota bonus method but 
also effectively lowered the marginal prices the state paid for above-quota deliveries. 

Planned procurement of agricultural products other than grain 
and cotton was to be gradually eliminated and replaced by free- 
market allocation. In general, state commercial departments would 
increasingly buy and sell on the market. Through market participa¬ 
tion, state commercial departments could not only make supplemen¬ 
tary purchases to meet the need for exports and continued planned 
supply to urban areas but also exert influence on free-market trends 
(Jiage lilun yu shijian, no. 4 [1985], p. 51). 

The full effects of the second-stage commercial reforms are still 
unclear. Recent reports suggest that, in practice, the grain contracts 
are not always voluntary and often closely resemble the old procure¬ 
ment quotas except that state procurements are limited to the con¬ 
tract amount (Oi 1986, pp. 284-90; also author interviews). The pro¬ 
gram has probably been successful in reducing state procurements 
and easing state storage and budgetary problems. 8 Lower grain pro¬ 
duction in 1985 and 1986, however, has raised questions about the 
new procurement and price system. 


An Analytical Framework 

The sorts of commercial problems that emerged after 1978 can be 
understood using standard economic tools. For this purpose, a gen¬ 
eral equilibrium-type approach is useful because it permits market 
prices to be determined endogenously given the presence of state 
commercial planning. The model presented below follows in the 
spirit of standard general equilibrium theory. It contains numerous 
producers and consumers who choose levels of production and con¬ 
sumption in response to price levels; markets exist for all goods, and 
market price levels are determined by demand and supply. 

This model differs from the standard general equilibrium ap¬ 
proach in two important ways. First, it includes not only market ex¬ 
change but also state commerce. Producers and consumers can trade 
on markets at market prices or with the state at state-planned prices. 
Second, the model is not, in some sense, completely closed. For any 
set of state prices and quotas, the model will yield market prices that 
set all excess demands in private markets equal to zero. Supply need 
not equal demand, however, in state commercial activities. In 
equilibrium the state can hold unsold inventories and need not have a 
balanced budget. Therefore, in equilibrium overall supply and 
demand need not balance. Whether or not overall supply and demand 
balance will, as shown below, depend on the levels of state-planned 
prices and quotas. 

8 In 1985 state grain procurement declined by 16 percent, and the government 
budgetary deficit was reduced (Sicular 1986, p. 38a; State Statistical Bureau 1986, pp. 
545, 595). 

The plan is specified exogenously to the model. For any set of 
planned prices and quotas, the model yields equilibrium market¬ 
clearing prices, levels of production, income, and consumption, and 
levels of state stockpiles and net budgetary revenues. This approach 
permits direct analysis of the effects of planning on markets but only 
indirect analysis of the effect of markets on planning. One could 
alternatively make the plan endogenous by specifying the govern¬ 
ment as an optimizing agent that chooses optimal prices and quotas in 
accordance with some objective function. Whether or not the Chinese 
government is, in fact, an optimizing agent is debatable. Suffice it to 
say that the reform process described above portrays a government 
that does not optimize globally but adjusts preexisting state prices and 
quotas from time to time to meet specific goals or when visible prob¬ 
lems arise. Under this sort of adaptive or partial optimization, imbal¬ 
ances emerge. The theoretical approach taken here allows for such 
imbalances. 

The model does not incorporate uncertainty or information prob¬ 
lems, although such problems are undoubtedly relevant. Nor is plan 
evasion explicitly included, although the model provides insights as to 
when and where plan evasion is likely to occur. Further research to 
analyze the effects of uncertainty, information asymmetries, and 
evasion would probably yield fruitful results, but in-depth analysis of 
these issues is not within the scope of this paper. 

Related work within the general equilibrium literature includes ar¬ 
ticles that examine equilibrium in the presence of price or quantity 
rigidities, such as Drèze (1975), and general equilibrium in the 
presence of Soviet-style planning, such as Feltenstein (1979). For the most 
part, however, the related general equilibrium literature does not 
treat the case in which planned and market allocation exist for the 
same good or goods. Exceptions include Sah and Srinivasan’s (in 
press) examination of the rural food levy in India and Byrd's (1987) 
analysis of plan and market in China's industrial sector. 

The Basic Model 

The model contains two sectors, urban and rural, and three goods. 
The urban sector consists of industrial producers and urban consum¬ 
ers, and the rural sector consists of farm households that both pro¬ 
duce and consume. For simplicity, agents of each type are assumed to 
be identical. This permits the individual agents to be aggregated into 
an aggregate industrial producer (j = 0), aggregate urban consumer 
(j =1), and aggregate farm household (j = 2). Hereafter agents will 
be referred to as aggregate agents. 

The model contains three goods, a manufactured good (k = 1) 
produced by the industrial producer, an agricultural good (k = 2) 
produced by the aggregate farm household, and a third good (k = 3) 
that is not produced but of which both the urban consumer and the 
farm household hold initial endowments. This third good can be 
thought of as human time, land, or some aggregation of various 
endowed assets. For convenience, it will be referred to here as human 
time or labor/leisure. All three goods can be traded on markets. 

Consumer preferences in both sectors are defined over consumption 
of the three goods. The utility functions $u_j = u_j(x_{j1}, x_{j2}, x_{j3})$, 
$j = 1, 2$, are assumed to have the usual properties, that is, to be continuous, 
convex, and monotonic. Preferences of the urban consumer and rural 
household can differ. The urban consumer maximizes its utility 
subject to income derived from its endowment. The rural household 
maximizes its utility subject to its endowment income plus net income 
or profits from agricultural production. The industrial producer is 
assumed to maximize profits. 9 

Industrial profits are not distributed to consumers but go directly 
into the state budget. This is in accordance with actual practice in 
China, where, until recently, all industrial profits reverted to the state. 
In fact, industrial and commercial profits have in the past constituted 
the major single source of government revenues. 10 Although reforms 
since 1978 have permitted industrial enterprises to retain a portion of 
their profits, a majority of their earnings still revert to the state. 
Production is given by 

$$\text{industrial: } q_{01} = f^1(q_{02}, q_{03}),$$ 

$$\text{agricultural: } q_{22} = f^2(q_{21}, q_{23}).$$ 

The manufactured good is produced using inputs of the agricultural 
good and human time; the agricultural good is produced using inputs 
of the manufactured good and human time. These production func¬ 
tions are assumed to be regular and quasi-concave. 

In the absence of commercial planning, maximization of profits by 
the industrial producer and of utility by the urban consumer and 
rural household would yield the usual supply, input demand, and 
consumer demand functions: 

9 Research by Granick (1986) and Wong (1986) suggests that industrial producers in 
China have to some extent been guided by profits, especially in the reform period, but 
still are not simple profit-maximizing agents. For the purposes of this paper, however, 
profit maximization is a useful assumption. 

10 The close link between industrial profits and government revenues also applies in 
certain other socialist economies. 
$$\text{supply/input demand functions: } q_{jk} = y_{jk}(p),$$ 

$$\text{urban consumer demand: } x_{1k} = d_{1k}(p;\, p_3\omega_{13}),$$ 

$$\text{rural consumer demand: } x_{2k} = d_{2k}(p;\, p_3\omega_{23} + R_2(p)),$$ 

where $j = 0, 2$, $k = 1, 2, 3$, and $R_2$ is maximum profits from farming: 

$$R_2(p) = p_2 y_{22}(p) - p_1 y_{21}(p) - p_3 y_{23}(p).$$ 

Market equilibrium would occur at prices $p^* = (p_1^*, p_2^*, p_3^*)$ that set 
excess demands equal to zero for all goods. 


State Planning 

State commercial planning is designed to reflect the structure of ag¬ 
ricultural planning used in China prior to the most recent (1985) 
stage of reforms. The government sets prices $\bar{p} = (\bar{p}_1, \bar{p}_2, \bar{p}_3)$ and fixes 
ration quotas $\bar{r}_{jk}$ and procurement quotas $\bar{s}_{jk}$. Ration quotas specify 
maximum state sales of the manufactured good to the urban 
consumer ($\bar{r}_{11}$) and rural household ($\bar{r}_{21}$), of the agricultural good to the 
urban consumer ($\bar{r}_{12}$) and urban producer ($\bar{r}_{02}$), and of labor time to 
the urban producer ($\bar{r}_{03}$). Ration quotas constitute a maximum 
constraint on purchases from the state: agents can purchase less than 
their ration but not more. If $r_{jk}$ represents purchases of good $k$ by 
agent $j$ from the state, the ration quota constraint can be written as 
$r_{jk} \le \bar{r}_{jk}$. The state does not sell manufactured goods to the urban 
producer, agricultural goods to the rural household, or human time 
to either the urban consumer or rural household. 

Similarly, procurement quotas constitute minimum constraints on 
deliveries of goods to the state. Each agent is required to deliver at 
least its quota amount to the state. The state guarantees to purchase as 
much of the goods as agents wish to sell beyond their quotas. Thus if 
$s_{jk}$ represents the actual amount of good $k$ delivered to the state by 
agent $j$, a procurement quota constraint can be written as $s_{jk} \ge \bar{s}_{jk}$. 
Procurement quotas are set for sales to the state of the manufactured 
good by the urban producer, of the agricultural good by the rural 
household, and of labor time by the urban household. The state does 
not buy a good from an agent unless that agent has a procurement 
quota for that good. In other words, the state buys good 1 only from 
the urban producer, good 2 only from the rural household, and good 
3 only from the urban household. 

Deliveries to and purchases from the state take place at state- 
planned prices. For simplicity, I assume that there are no transfer 
costs and that for each good the state sales price is set equal to the 
state purchase price. In addition, I assume that for each particular 
good the sum of procurement quotas is equal to the sum of ration 
quotas: 

$$\bar{s}_{01} = \bar{r}_{11} + \bar{r}_{21}, \qquad \bar{s}_{22} = \bar{r}_{02} + \bar{r}_{12}, \qquad \bar{s}_{13} = \bar{r}_{03}.$$ 

In other words, the state plans for its purchases to exactly equal its 
sales when all quotas are binding. 11 

As long as they do not violate their quotas, agents are permitted to 
buy and sell freely on the market. Goods purchased on the market 
can be sold to the state, and goods bought from the state can be sold 
on the market. In this model, then, both state and market trade exist 
for all goods. All three agents can trade all three goods on the market, 
and each agent can trade some, but not all, goods with the state. 


Agent Maximization and Market Equilibrium for a 
Subset of Plan Levels 


For the purpose of exposition, it is useful to first consider a subset of 
all possible state-planned price and quota combinations. Let us define 
the subset $U^1$ of planned price and quota combinations as that in 
which, in equilibrium, all state prices are strictly less than market 
prices $p'$: 

$$U^1 = \{(\bar{p}, \bar{r}, \bar{s}) \mid \bar{p}_k < p'_k \text{ for all } k\}.$$ 


In the presence of such a plan, market prices will guide marginal 
decisions. State prices and quotas will not directly affect levels of 
production and consumption. The plan will, however, affect levels of 
trade on the market and with the state, income distribution, and 
equilibrium market prices. 

Consider first the behavior of the aggregate rural household in the 
presence of such a plan. Its utility maximization problem is as follows: 

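$$\begin{aligned}
\max_{x,\,r,\,s,\,q} L_2 = {} & u_2(x_2) - \theta_2\big[p'_3\omega_{23} + (p'_1 - \bar{p}_1) r_{21} - (p'_2 - \bar{p}_2) s_{22} - p'_1 x_{21} \\
& - p'_2 x_{22} - p'_3 x_{23} + p'_2 f^2(q_2) - p'_1 q_{21} - p'_3 q_{23}\big] \\
& - \alpha_2 (r_{21} - \bar{r}_{21}) - \beta_2 (\bar{s}_{22} - s_{22}).
\end{aligned}$$ 

First-order conditions of this problem yield 

$$\frac{\partial u_2/\partial x_{2h}}{\partial u_2/\partial x_{2k}} = \frac{\partial f^2/\partial q_{2h}}{\partial f^2/\partial q_{2k}} = \frac{p'_h}{p'_k} \quad \text{for all } h, k.$$ 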


11 I make these assumptions for notational simplicity. In China the prevalent policy 
has been to set the purchase price higher than the sales price for agricultural products 
and the sales price higher than the purchase price for industrial products. Further¬ 
more, the sums of purchase and sales quotas probably do not all equal zero. Differences 
between state purchase and sales prices and nonzero sums of quotas can be incorpo¬ 
rated into the model quite easily. They will change the equilibrium solution but will not 
qualitatively alter the conclusions. 
Thus marginal decisions are guided only by market prices. State 
prices and quotas do not directly influence the production and con¬ 
sumption choice. 

The rural household’s supply and demand functions are identical 
to those that would exist in the absence of planning, except that rural 
income now includes an extra component: 

$$q_{2k} = y_{2k}(p'),$$ 

$$x_{2k} = d_{2k}(p';\, p'_3\omega_{23} + R_2(p') + T_2).$$ 

Commercial planning changes rural income by a transfer $T_2$ equal to 
the sum of trade levels with the state times the differences between 
state and market prices: 

$$T_2 = (p'_1 - \bar{p}_1) r_{21} - (p'_2 - \bar{p}_2) s_{22}.$$ 

In fact, in equilibrium $T_2$ will be a lump-sum transfer equal to the sum 
of the quota levels times the differences between market and state 
prices: 

$$T_2 = (p'_1 - \bar{p}_1)\bar{r}_{21} - (p'_2 - \bar{p}_2)\bar{s}_{22}.$$ 

This holds because when all state prices are strictly less than equilib¬ 
rium market prices, then all quotas are strictly binding. For goods 
purchased from the state, if the state price is lower than the market 
price, the consumer will buy as much as it can, the full amount of the 
ration quota, from the state and then sell whatever it does not con¬ 
sume at the higher market price. For goods sold to the state, if the 
state price is lower than the market price, the consumer will sell as 
little as possible, the quota amount, to the state. 
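
The lump-sum transfer to the rural household can be illustrated with a few lines of Python. This is a minimal sketch that assumes both quotas bind, that is, that state prices lie strictly below market prices for goods 1 and 2; the function name `rural_transfer` and all numerical values are hypothetical and serve only to show how the two terms offset one another.

```python
def rural_transfer(market, state, ration_good1, procurement_good2):
    """Lump-sum transfer T_2 to the rural household when both quotas bind.

    market, state: dicts mapping good number (1 or 2) to its price.
    ration_good1: ration quota for the manufactured good (bought from the state).
    procurement_good2: procurement quota for the agricultural good (sold to the state).
    """
    for good in (1, 2):
        if not state[good] < market[good]:
            raise ValueError("formula assumes state prices strictly below market prices")
    # Rations bought cheaply from the state are worth the price gap per unit.
    gain_on_rations = (market[1] - state[1]) * ration_good1
    # Quota deliveries sold cheaply to the state cost the price gap per unit.
    loss_on_deliveries = (market[2] - state[2]) * procurement_good2
    return gain_on_rations - loss_on_deliveries


# Hypothetical prices and quotas, for illustration only.
market_prices = {1: 10.0, 2: 6.0}
state_prices = {1: 8.0, 2: 4.0}
t2 = rural_transfer(market_prices, state_prices, ration_good1=5.0, procurement_good2=20.0)
```

With these assumed numbers the loss on quota deliveries outweighs the gain on rationed purchases, so the transfer is negative: the plan acts as a lump-sum tax on the rural household.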

The microeconomic behavior of the urban producer and consumer 
is similar to that of the rural household. For the urban producer we 
have 


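$$\begin{aligned}
\max_{r,\,s,\,q} R_0 = {} & p'_1 f^1(q_0) - p'_2 q_{02} - p'_3 q_{03} - (p'_1 - \bar{p}_1) s_{01} \\
& + (p'_2 - \bar{p}_2) r_{02} + (p'_3 - \bar{p}_3) r_{03} - \alpha_0 (\bar{s}_{01} - s_{01}) \\
& - \beta_0 (r_{02} - \bar{r}_{02}) - \sigma_0 (r_{03} - \bar{r}_{03}).
\end{aligned}$$ 

The urban producer’s supply and input demand functions are the 
same as those that would exist in the absence of state planning, 
$q_{0k} = y_{0k}(p')$. In equilibrium its maximum net income will equal maximum 
profits plus the lump-sum transfer $T_0$: 

$$T_0 = -(p'_1 - \bar{p}_1)\bar{s}_{01} + (p'_2 - \bar{p}_2)\bar{r}_{02} + (p'_3 - \bar{p}_3)\bar{r}_{03}.$$ 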
The urban consumer’s maximization problem is 

$$\begin{aligned}
\max_{x,\,r,\,s} L_1 = {} & u_1(x_1) - \theta_1\big[p'_3\omega_{13} + (p'_1 - \bar{p}_1) r_{11} + (p'_2 - \bar{p}_2) r_{12} \\
& - (p'_3 - \bar{p}_3) s_{13} - p'_1 x_{11} - p'_2 x_{12} - p'_3 x_{13}\big] - \alpha_1 (r_{11} - \bar{r}_{11}) \\
& - \beta_1 (r_{12} - \bar{r}_{12}) - \sigma_1 (\bar{s}_{13} - s_{13}).
\end{aligned}$$ 

Like the rural household, it sets the ratios of marginal utilities equal to 
the ratios of market prices. Urban consumer demands are 

$$x_{1k} = d_{1k}(p';\, p'_3\omega_{13} + T_1),$$ 

where urban income includes a transfer that in equilibrium equals 

$$T_1 = (p'_1 - \bar{p}_1)\bar{r}_{11} + (p'_2 - \bar{p}_2)\bar{r}_{12} - (p'_3 - \bar{p}_3)\bar{s}_{13}.$$ 
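
As a consistency check on the three transfer terms, the following short derivation (a worked step added here, not one stated in this form in the text) shows that they sum to zero when all quotas bind and procurement quotas equal ration quotas good by good. It writes $\bar{p}_k$ for state prices, $p'_k$ for market prices under the plan, and $\bar{r}_{jk}$, $\bar{s}_{jk}$ for the ration and procurement quotas, with $T_0$, $T_1$, and $T_2$ the equilibrium transfers to the urban producer, urban consumer, and rural household.

```latex
\begin{aligned}
T_0 + T_1 + T_2
  &= (p'_1 - \bar{p}_1)\,(\bar{r}_{11} + \bar{r}_{21} - \bar{s}_{01})
   + (p'_2 - \bar{p}_2)\,(\bar{r}_{02} + \bar{r}_{12} - \bar{s}_{22})
   + (p'_3 - \bar{p}_3)\,(\bar{r}_{03} - \bar{s}_{13}) \\
  &= 0,
\end{aligned}
\qquad\text{since}\qquad
\bar{s}_{01} = \bar{r}_{11} + \bar{r}_{21},\quad
\bar{s}_{22} = \bar{r}_{02} + \bar{r}_{12},\quad
\bar{s}_{13} = \bar{r}_{03}.
```

Under these assumptions the plan therefore only redistributes income among the three agents: what one agent gains from buying at low state prices is exactly what another loses from delivering at those prices.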

Equilibrium in this mixed plan and market environment occurs 
when excess demands on the market equal zero. Each agent’s excess 
demand on the market will be a function of market prices and will 
equal total excess demand plus (or minus) the relevant quota. Equilib¬ 
rium market prices are therefore those that solve the following set of 
market-clearing equations: 

$$\begin{aligned}
0 = {} & -y_{01}(p') + d_{11}(p';\, p'_3\omega_{13} + T_1) + y_{21}(p') \\
& + d_{21}(p';\, p'_3\omega_{23} + T_2 + R_2) + \bar{s}_{01} - \bar{r}_{11} - \bar{r}_{21}, \\
0 = {} & y_{02}(p') + d_{12}(p';\, p'_3\omega_{13} + T_1) - y_{22}(p') \\
& + d_{22}(p';\, p'_3\omega_{23} + T_2 + R_2) + \bar{s}_{22} - \bar{r}_{02} - \bar{r}_{12}, \\
0 = {} & y_{03}(p') + d_{13}(p';\, p'_3\omega_{13} + T_1) - \omega_{13} + y_{23}(p') \\
& + d_{23}(p';\, p'_3\omega_{23} + T_2 + R_2) - \omega_{23} + \bar{s}_{13} - \bar{r}_{03}.
\end{aligned}$$ 

Under the assumption that the state chooses quota levels so that for 
each good the procurement and ration quotas net to zero, 
the last terms in these equations drop out. Markets clear, therefore, at 
prices $p' = (p'_1, p'_2, p'_3)$ identical to the equilibrium prices that would 
occur if the state did not engage in commercial planning but 
implemented lump-sum transfers $T_j$. In other words, when the state plan 
belongs to $U^1$, the mixed plan and market equilibrium is equivalent to 
that in a pure market economy with lump-sum transfers $T_j$. 
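
The equivalence between the mixed system and a pure market economy with lump-sum transfers can be seen in a deliberately stripped-down numerical sketch. The Python code below is not the model of this paper: it assumes a pure exchange economy with no production, only two goods (the agricultural good and labor, with labor as numeraire), Cobb-Douglas consumers, and a single quota that the state buys from the rural household and resells to the urban consumer at the same state price. All function names and numbers are hypothetical.

```python
def market_price(share_urban, share_rural, endow_urban, endow_rural,
                 state_price=0.0, quota=0.0):
    """Market-clearing price of the agricultural good (labor is the numeraire).

    share_*: Cobb-Douglas budget share each consumer spends on the agricultural good.
    endow_*: (agricultural good, labor) endowment of each consumer.
    quota:   amount the rural household must deliver to the state at state_price,
             which the state resells to the urban consumer at the same price.
    """
    total_agri = endow_urban[0] + endow_rural[0]
    gap = share_urban - share_rural
    numerator = (share_urban * endow_urban[1] + share_rural * endow_rural[1]
                 - gap * state_price * quota)
    denominator = (total_agri - share_urban * endow_urban[0]
                   - share_rural * endow_rural[0] - gap * quota)
    return numerator / denominator


def market_excess_demand(price, share_urban, share_rural, endow_urban, endow_rural,
                         state_price, quota):
    """Excess demand on the free market for the agricultural good under the plan."""
    transfer_urban = (price - state_price) * quota    # gains from cheap rations
    transfer_rural = -(price - state_price) * quota   # loses on quota deliveries
    income_urban = price * endow_urban[0] + endow_urban[1] + transfer_urban
    income_rural = price * endow_rural[0] + endow_rural[1] + transfer_rural
    demand_urban = share_urban * income_urban / price
    demand_rural = share_rural * income_rural / price
    # Rations reduce what the urban consumer buys on the market; quota
    # deliveries reduce what the rural household can offer on the market.
    urban_market = demand_urban - endow_urban[0] - quota
    rural_market = demand_rural - endow_rural[0] + quota
    return urban_market + rural_market


# Hypothetical economy, for illustration only.
urban, rural = (0.0, 10.0), (10.0, 5.0)
p_free = market_price(0.5, 0.3, urban, rural)
p_plan = market_price(0.5, 0.3, urban, rural, state_price=0.5, quota=2.0)
residual = market_excess_demand(p_plan, 0.5, 0.3, urban, rural, 0.5, 2.0)
```

In this sketch the quota terms cancel out of the market-clearing condition, so the plan moves the market price only through the income transfer, and only because the two consumers spend different shares of income on the agricultural good. With identical budget shares the price under the plan coincides with the price without it.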

In summary, if a plan belongs to $U^1$, then state-planned prices and 
quotas will not directly affect levels of production and consumption. 
They can have an indirect effect on these variables, however, because 
they change incomes by lump-sum transfers $T_j$ and so can alter 
equilibrium market prices. Furthermore, quotas will directly affect each 
agent’s levels of trade with the state and on the market. Levels of 
trade with the state will exactly equal quota levels, and market trade 
for each agent will equal its excess demand minus the ration quota, or 
plus the procurement quota. 

12 If state procurement and ration quotas did not sum to zero, then market-clearing 
prices would equal those that would occur if the government carried out lump-sum 
transfers and injected (or subtracted) some fixed quantity of output onto (from) the 
market. 

These conclusions will hold no matter how high the state sets 
quotas. State quotas can, in fact, exceed production levels without 
directly influencing production. This is possible because producers 
can meet their quotas by either producing or buying on the market. If 
the producer buys on the market, which it will if the quota level is so 
high that the marginal cost of production exceeds the market price, 
then it will buy from consumers. The consumers in turn will have 
obtained their goods from the state, which originally purchased them 
from the producers. Thus a high quota can be met simply by the 
circulation of goods between the producer, state, and consumer and 
then back to the producer, which once again delivers them to the 
state. 


Maximization and Market Equilibrium When State Prices 
Are Too High 

Now let us consider another possible subset of planned price and 
quota combinations, specifically, that set for which one or more state- 
planned prices are greater than or equal to the market price. The set 
$U^2$ of state plans in which one or more state prices are “too high” is 
defined as 

$$U^2 = \{(\bar{p}, \bar{r}, \bar{s}) \mid \bar{p}_k \ge p'_k \text{ for one or more } k\}.$$ 

On closer inspection, it becomes apparent that a state price can 
equal, but not exceed, the equilibrium market price. This can be 
shown using a reductio argument. Suppose the state sets $\bar{p}_1 > p'_1$. Then 
industrial producers could increase their income by buying good 1 for 
a low price on the market and selling it to the state for a higher price. 
They would, in fact, continue to buy low and sell high as long as the 
market price was less than the state price. Such activity would drive up 
the market price until it exactly equaled the state price. The same 
argument holds for goods 2 and 3. Consequently, in equilibrium, 
state prices can be less than or equal to, but not greater than, market 
prices. 

The argument above reveals that if a state price is “too high,” the 
quotas associated with that good may no longer be binding. For such a 
good, agents with procurement quotas are likely to sell more than 
their quotas to the state, and agents with ration quotas are likely to 
buy less than their quota amounts from the state. In either event, the 
state will buy more than it sells and so end up holding unsold 
inventories. 

State planning when a price is too high will, as for state planning 
when all state prices are strictly less than market prices, continue to 
cause lump-sum transfers among agents. For any good whose market 
price exceeds its state price, the quotas will be binding, and so the 
income transfer term is as above. For any good whose state price is too 
high, the state price must equal the market price, and so the income 
transfers associated with such goods will be zero. Regardless, then, the 
income transfer will equal the sum of quota levels multiplied by the 
differences between state and market prices. 

When one or more state prices are too high, therefore, in equilib¬ 
rium all state prices must be less than or equal to their corresponding 
equilibrium market prices. For goods whose state prices are strictly 
less than their market prices, all quotas are binding and state procure¬ 
ment will exactly equal state sales. For any good whose state price is 
too high, the market price will be driven up to equal the state price. 
Markets clear at this high market price because the excess of market 
supply over demand is absorbed by state procurement. 
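
The way a state price that is too high pulls the market price up and leaves the state holding inventories can be sketched for a single good in partial equilibrium. The Python code below is only an illustration under strong assumptions that are not part of the model: private demand and supply are linear, income effects and quotas are ignored, and the state buys whatever is offered at its price. The function name and parameter values are hypothetical.

```python
def equilibrium_with_state_price(a, b, c, d, state_price):
    """Return (market_price, state_inventory) for one good.

    Private demand is a - b * p and private supply is c + d * p (a hypothetical
    linear specification). The state stands ready to buy any amount at state_price.
    """
    clearing_price = (a - c) / (b + d)
    if state_price <= clearing_price:
        # The state price is not "too high": the market clears on its own.
        return clearing_price, 0.0
    # Arbitrage drives the market price up to the state price; the state
    # absorbs the excess of private supply over private demand.
    demand = a - b * state_price
    supply = c + d * state_price
    return state_price, supply - demand


# Hypothetical parameters: the private market alone would clear at a price of 30.
low = equilibrium_with_state_price(100.0, 2.0, 10.0, 1.0, state_price=20.0)
high = equilibrium_with_state_price(100.0, 2.0, 10.0, 1.0, state_price=40.0)
```

In the second case the market price equals the state price of 40 and the state is left with 30 units that private buyers do not take at that price.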


Income Transfers in the Mixed System 

The extent to which the distribution of income in the mixed plan and 
market environment differs from that in the pure market environment 
depends both on equilibrium levels of the transfers $T_j$ and on 
the difference between equilibrium prices without planning p* and 
with planning p'. The total changes in income between the pure mar¬ 
ket and the mixed plan and market situations equal the sum of these 
transfers plus the changes in income attributable to differences be¬ 
tween equilibrium market prices with and without planning: 

$$Y'_0 - Y^*_0 = T_0,$$ 

$$Y'_1 - Y^*_1 = T_1 + (p'_3 - p^*_3)\,\omega_{13},$$ 

$$Y'_2 - Y^*_2 = T_2 + (p'_3 - p^*_3)\,\omega_{23}.$$ 

Profit terms do not enter these equations because competition drives 
profits from production to zero in both the pure market and the 
mixed plan and market environment. 

These expressions indicate that state commercial planning can 
change incomes by more or less than the lump-sum transfer. For 
example, even if the state sets prices and quotas so as to effect a 
negative lump-sum transfer from the rural sector, that is, so that 
$T_2 < 0$, rural income need not be reduced from its pure market level. 
Such a planning strategy will not reduce rural income if the 
imposition of planning raises the price of good 3 by enough to offset the 
negative lump-sum transfer. 
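
A small numerical illustration of this offsetting effect follows. The Python sketch evaluates the rural income change as the lump-sum transfer plus the revaluation of the rural endowment of good 3; the function name `rural_income_change` and all numbers are hypothetical, picked so that a negative transfer is more than offset by a higher price of good 3.

```python
def rural_income_change(transfer, price_with_plan, price_without_plan, endowment):
    """Change in rural income between the pure market and the mixed system.

    transfer: lump-sum transfer T_2 (negative when the plan taxes the rural sector).
    price_with_plan, price_without_plan: market price of good 3 in each regime.
    endowment: rural endowment of good 3.
    """
    revaluation = (price_with_plan - price_without_plan) * endowment
    return transfer + revaluation


# Hypothetical values, for illustration only: a transfer of -5 combined with
# a rise in the price of good 3 from 1.00 to 1.25 on an endowment of 40.
change = rural_income_change(transfer=-5.0, price_with_plan=1.25,
                             price_without_plan=1.0, endowment=40.0)
```

Here the endowment revaluation of 10 exceeds the transfer of 5 taken from the rural household, so rural income rises by 5 despite the negative lump-sum transfer.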

The conclusion that mandatory agricultural procurement at low 
prices can raise rather than lower rural income has been discussed 
elsewhere. Millar’s (1974) and Ellman’s (1975) empirical studies of the 
introduction of mandatory procurement in the USSR during the First 
Five-Year Plan (late 1920s and early 1930s), for example, conclude 
that reductions in rural income due to low state prices were offset by 
higher farm profits following a rise in the free-market prices for 
agricultural goods. The tax burden was as a result shifted to the 
urban residents who purchased farm products in free markets. 
Hayami, Subbarao, and Otsuka (1982) reach a similar conclusion with 
respect to state mandatory procurement of cereals in India. The 
conclusion here differs somewhat from those of these other studies in 
that the increased market prices enhance rural incomes by raising the 
value of the rural endowment, and not by raising farm profits. Farm 
profits will not increase even if the market prices of agricultural goods 
rise because in equilibrium competition drives profits to zero. 


Equilibrium and Plan Sustainability 

The question of plan sustainability in a mixed plan and market 
economy would be addressed more fully by a dynamic model; still, the 
static framework employed here sheds light on the issue. Plan 
sustainability in this framework depends on two factors: the extent of 
state budgetary losses due to commercial planning and the extent of 
plan evasion. In practice, state budgetary losses from commercial 
planning can be sustained if the government has sources of revenue 
other than those discussed here. Such losses could be financed, for 
example, by income or excise taxes borne by urban consumers and 
rural households, by sales of surplus state stockpiles in international 
markets, by foreign aid, or by the issuing of money. International 
demand and the availability of foreign aid, however, are not 
particularly reliable sources of revenues, while other taxes and the issuing of 
money can have negative distributive, inflationary, and efficiency 
consequences. Large budget losses due to commercial planning, then, are 
likely to be difficult to sustain and so ultimately will induce plan 
revisions or reform. Such, in fact, has been the case in China during the 
recent reform period. 

In the model presented here, total state net budgetary earnings due
to commercial planning have two components. The first is the income
transfer to industry $T_0$, which goes into the state budget. The second
is net earnings $C$ in state commercial activities:

$$C = p_1(r_{11} + r_{21} - s_{01}) + p_2(r_{12} + r_{02} - s_{22}) + p_3(r_{03} - s_{13}),$$

where $C$ equals revenues on state sales minus expenditures on
procurements. Total net state budgetary earnings due to commercial
planning $B$ are thus given by $B = T_0 + C$.

Since state purchase and sales prices are equal and the government 
cannot sell more of a good than it has purchased, net commercial 
revenues C must be less than or equal to zero. Net commercial reve¬ 
nues, in other words, cannot show a strict surplus. They will be nega¬ 
tive if the state sells less of any good than it has purchased, that is, if it 
holds unsold inventories. Such is likely to be the case if one or more 
state prices are too high. 
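
The budget accounting in the preceding paragraphs can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the paper's own calculation: the function names, the three-good layout, and every number are hypothetical. It treats net commercial earnings as state sales revenue minus procurement spending at a single state price per good, and total net budgetary earnings as the transfer term plus those earnings.

```python
def net_commercial_earnings(state_prices, sold, purchased):
    """State sales revenue minus procurement spending, summed over goods.

    Assumes one state price per good for both purchases and sales,
    as in the text. All inputs are hypothetical illustration values.
    """
    return sum(p * (s - q) for p, s, q in zip(state_prices, sold, purchased))


def net_budgetary_earnings(transfer_to_industry, commercial_earnings):
    """Total net budgetary earnings: the transfer term plus commercial earnings."""
    return transfer_to_industry + commercial_earnings


# Hypothetical example: three goods; the state cannot sell more than it bought.
prices = [2.0, 1.0, 1.5]
purchased = [10.0, 40.0, 20.0]
sold = [10.0, 34.0, 20.0]  # six units of the second good remain unsold

C = net_commercial_earnings(prices, sold, purchased)
B = net_budgetary_earnings(4.0, C)  # 4.0 is an assumed transfer value
print(C, B)
```

With sales never exceeding purchases, each term of the sum is nonpositive, so commercial earnings are zero only when all inventories clear; in this illustration the unsold stock produces a loss that the assumed transfer does not fully cover.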

The size of the transfer to industry $T_0$ will depend on quota levels
and the differences between state and market prices. The transfer will 
be larger when the state prices of the agricultural good and human 
time are low, and of the industrial good high, relative to market 
prices. Thus the state can earn revenues by undervaluing agricultural 
output and maintaining low state wages. Such, in fact, has been gov¬ 
ernment policy since the 1950s. 13 Similarly, the transfer will be larger 
if industry’s delivery quota is low and its ration allocations are high. 

The state will suffer a deficit from its commercial planning if the 
transfer to industry is negative or if it is positive but insufficiently 
large to offset losses in commercial activities. Such a deficit is probable 
when state and market prices are close together, and especially when 
state prices for the agricultural good or human time are too high. In 
this situation not only will the state lose money in its trade of these 
items, but also the nonnegative terms of industrial transfer associated 
with its ration quotas will equal zero. 

The marginal income gain from quota evasion equals the difference
between the market and state prices. The utility gains to consumers
from evasion depend further on the marginal utilities of their
income. Consider, for example, the rural household's marginal gain
from evading its procurement quota. First-order conditions tell us
that the marginal utility from an incremental change in the quota
equals

p 2 * e 2 (# - fa). 

Thus the incentive for the rural sector to evade is greater when the 
difference between the state and market prices is large and when 
rural income is low (assuming that the marginal utility of income is 
inversely related to the level of income). 
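
A small Python sketch may help fix the idea. It is illustrative only: the logarithmic utility of income (so that marginal utility is the reciprocal of income), the function names, and the numbers are all assumptions introduced here, not part of the model above.

```python
def marginal_income_gain(market_price, state_price):
    """Income gained per unit of quota evaded: market price minus state price."""
    return market_price - state_price


def marginal_utility_gain(market_price, state_price, income):
    """Utility gained per unit evaded, ASSUMING utility = ln(income).

    Under that assumption the marginal utility of income is 1 / income,
    so the same price gap matters more to a poorer household.
    """
    return marginal_income_gain(market_price, state_price) / income


# Hypothetical numbers: the same price gap faced at two income levels.
low_income_gain = marginal_utility_gain(3.0, 2.0, income=50.0)
high_income_gain = marginal_utility_gain(3.0, 2.0, income=200.0)
print(low_income_gain, high_income_gain)
```

Under these assumptions the incentive to evade rises with the price gap and falls with income, which is the comparison the text draws between sectors.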

In the urban sector, similar temptations exist to evade the ration 

13 See Sicular (1986, pp. 36-41) for a discussion of China's use of prices for implicit
taxation.
quota, that is, to try to buy more than the rationed amount of good 2.
If urban income is higher than rural income, however, the marginal
utility of evasion will be lower than that in the rural sector.
Furthermore, evasion of a ration quota requires that a state official in charge
of planned distribution be willing to sell under the counter. In
general, then, problems of evasion are more likely to emerge for quota
constraints applying to the rural (lower-income) sector and for
procurement rather than ration quotas.

Problems of plan sustainability can arise, then, when state prices are
set either too low or too high. If the former, agents will be tempted to
evade plans. If the latter, the state might hold unsold inventories and
incur losses on its commercial activities. The standard by which prices
are judged to be low or high is the set of market equilibrium prices.


The Effects of Changes in State Prices and Quotas


The effects of a change in the state plan on market prices and income
distribution are ambiguous. 14 Equilibrium market prices can increase
or decrease when the government alters a planned price; similarly,
market prices can increase or decrease when the state alters a quota
level. Whether equilibrium market prices increase or decrease
depends on the signs of fairly complicated expressions containing the
derivatives of sectoral demands and supplies with respect to incomes
and prices.

The lack of straightforward comparative statics results means that
the effects of changes in planned prices and quotas are difficult to
predict and could contradict the planners' intentions. For example, if
planners wished to raise rural incomes, they might consider
increasing the state price of the agricultural good. However, since market
prices can either increase or decrease with $p_2$, an increase in $p_2$ could
conceivably reduce rural income. This can be seen mathematically
from the expression for a change in rural income given a change in
the planned price:


$$\frac{\partial Y_2}{\partial p_2} = \cdots$$



If $\partial p_k^*/\partial p_2 = 0$ for all $k$, then an increase in $p_2$ would have an
unambiguously positive effect on rural income. A rise in the state price,
however, could cause the market price for good 1 or 3 to fall or the
market price for good 2 to rise; if these changes in market prices were
sufficiently large, rural income would decline. Whether rural income


14 The mathematical derivation of the comparative statics results discussed here
will be provided by the author on request.
ultimately increases or decreases will depend on how market prices
change with the state price and on the levels of the rural household’s 
quotas and endowments. 

Similarly, rural income can rise or fall with a change in the procure¬ 
ment quota level: 


$$\frac{\partial Y_2}{\partial s_{22}} = (p_2 - p_2^*) + \frac{\partial p_1^*}{\partial s_{22}}\, r_{21} - \frac{\partial p_2^*}{\partial s_{22}}\, s_{22} + \frac{\partial p_3^*}{\partial s_{22}}\, \omega_{23}.$$


If market equilibrium prices do not change with $s_{22}$ and if the market
price exceeds the state price, then a reduction in the quota will
unambiguously raise rural income. However, if the adjustment in the quota
causes sufficiently large offsetting changes in $p_1^*$, $p_2^*$, or $p_3^*$, rural
income would fall.


Conclusions 

Recent developments in China's agricultural commerce can be under¬ 
stood in terms of the theoretical analysis above. The oversupply of 
grain to the state and the rapid accumulation of unsold state grain 
stockpiles in the early eighties occurred because the state above-quota 
price for grain became too high. State prices may not have been too 
high initially, but they became so as decollectivization and increased 
scope for regional specialization shifted the grain supply curve out¬ 
ward, thus depressing market prices. Price seasonality, in particular, 
low market prices at harvesttime, the most active season for state 
procurement, reinforced these problems. 

The theoretical analysis also sheds light on other aspects of China’s 
recent commercial experience. With the convergence of market to 
state above-quota prices, losses on state trade grew. Government reve¬ 
nues from industry were also shrinking because of concurrent re¬ 
forms permitting enterprises to retain some of their profits and be¬ 
cause changes in state and market prices had reduced the lump-sum 
transfer going to industry. These factors contributed to increasing 
budgetary deficits. 

In the new environment income transfers associated with procure¬ 
ment and ration quotas became more noticeable, and the resulting 
inequities began to cause concern. In addition, quota evasion grew to 
unprecedented levels. Such were the consequences of the widened 
differential between quota prices and the prices received on the mar¬ 
ket or for above-quota deliveries. It is not surprising that in the wake 
of these developments the government chose to eliminate the two-tier 
price structure and set a cap on procurement of farm products. 

The theoretical analysis above yields several general conclusions 
about the nature of a mixed plan and market system of the sort 
observed in China. The first is that state commercial planning need
not cause inefficiency or a Pareto-inferior allocation of resources. A
plan in which all state prices are lower than equilibrium market prices
and in which the transfer to industry $T_0$ is zero (or is used in an
optimal way) can achieve a Pareto-optimal equilibrium allocation of
resources. Furthermore, since planning causes lump-sum transfers
among agents and between agents and the state, the government can
use planning as an efficient means to pursue distributional or tax
objectives. Adjustments in state prices and quotas affect not only the
size of transfers but also equilibrium market prices; therefore, plan¬
ners should understand that resulting income changes will include 
not only changes in the lump-sum transfers but also changes in the 
values of endowments caused by shifts in equilibrium market prices. 

In a mixed system, plan and market interact in several ways. State 
prices and quotas affect equilibrium market prices and quantities
traded privately. Similarly, markets influence the operation of
planning. If state-planned prices are too low or too high relative to
equilibrium market prices, evasion or budgetary deficits may erode plan
sustainability. Furthermore, whenever market prices exceed state
prices, state prices and quotas will no longer directly influence
production and consumption.

Allowing markets to emerge alongside state planning, then, both
enhances and limits the power of planning. Markets enhance
planning's distributional function and improve allocational efficiency in
the mixed economy. Yet markets also limit the state's ability to direct
production and consumption by means of planned quotas and prices.

In his well-known analysis of the national debt, Robert Barro
introduced the notion of a “dynastic family.” This notion has since
become a standard research tool, particularly in the areas of public 
finance and macroeconomics. In this paper, we critique the assump¬ 
tions on which the dynastic model is predicated and argue that this 
framework is not a suitable abstraction in contexts in which the ob¬ 
jective is to analyze the effects of public policies. We reach this con¬ 
clusion by formally considering a world in which each generation 
consists of a large number of distinct individuals as opposed to one 
representative individual. We point out that family linkages form 
complex networks, in which each individual may belong to many 
dynastic groupings. The resulting proliferation of linkages between 
families gives rise to a host of neutrality results, including the irrele¬ 
vance of all public redistributions, distortionary taxes, and prices. 
Since these results are not at all descriptive of the real world, we 
conclude that, in some fundamental sense, the world is not even 
approximately dynastic. These observations call into question all 
policy-related results based on the dynastic framework, including 
the Ricardian equivalence hypothesis. 


Introduction 

Over the last decade, there has been a growing awareness that many
important public policy issues turn critically on the assumed nature of
economic relationships within the family. This awareness is largely
attributable to seminal papers by Barro (1974) and Becker (1974). 
Barro’s paper ostensibly concerns the national debt, but its implica¬ 
tions are much more far-reaching. Specifically, Barro supplemented 
the traditional overlapping generations model with intergenerational 
altruism and argued, in essence, that voluntary transfers between 
parents and children cause the representative “dynastic" family to 
behave as though it is a single, infinite-lived individual. Policies that 
fail to affect the family’s real opportunities are neutralized through 
private actions; thus Ricardian equivalence and related propositions 
(concerning the irrelevance of government debt and social security) 
follow directly. 1 The dynastic family model has since become a stan¬ 
dard research tool, particularly in the areas of public finance and 
macroeconomics (see, e.g., Chamley 1981; Abel 1984; Judd 1985). 2 

In this paper, we critique the dynastic family as a modeling tool and 
argue that it is not a suitable abstraction in contexts in which the 
objective is to analyze the effects of various public policies. Our criti¬ 
cism differs fundamentally from those offered by previous commen¬ 
tators (see, e.g., Feldstein 1976; Tobin and Buiter 1980) in that we 
assail neither the logic nor the assumptions employed by studies that 
invoke the dynastic framework. Rather, we take these at face value 
and show that they lead to untenable conclusions. 

We reach these conclusions by formally considering a world in 
which each generation consists of a large number of distinct individ¬ 
uals as opposed to one “representative” individual. While the notion 
of a representative consumer is always somewhat objectionable, here 
it is especially pernicious in that it obscures considerations arising 
from the biological structure of families. For the human species, 
propagation requires the participation of two traditionally unrelated 
individuals. Thus family linkages form complex networks in which 


1 While Barro's paper has been a centerpiece of the debate concerning the neutrality 
of national deficits, the earliest modern reference to the Ricardian proposition seems to 
be Bailey (1962). 

2 No doubt its popularity in part reflects considerations of analytic convenience.
Overlapping-generations models not only are generally less tractable but also often give
rise to equilibria with undesirable properties. Specifically, equilibria in
overlapping-generations models may fail to be either efficient or locally unique (see Balasko and
Shell 1980; Kehoe and Levine 1985). Failure of local uniqueness is particularly
troubling in any exercise involving comparative statics or dynamics. Thus Judd (1985)
unabashedly attributes his adoption of the dynastic framework to analytic convenience.
Unfortunately, this advantage may be illusory. In a recent paper, Gale (1985) has
pointed out that, while Barro's dynastic solution is an equilibrium for the model he
considers, this model also generally gives rise to a continuum of subgame-perfect
intergenerational equilibria (see Selten 1965, 1975), many of which are inefficient. By
adopting dynastic assumptions, one therefore does not necessarily succeed in avoiding
the problems that arise in the standard overlapping-generations framework.
each individual may belong to many dynastic groupings. In this
paper, we argue that the resulting proliferation of linkages between
families gives rise to incomparably stronger neutrality properties
under weaker conditions than those imposed by Barro. In particular,
no government transfer (including those between unrelated members
of the same generation) has any real effect, and all tax instruments
(including so-called distortionary taxes) are equivalent to lump-sum 
taxes. In essence, the government can affect the allocation of real 
resources only by altering real expenditures. The efficiency role of 
government is thus severely limited, and the distributional role is 
entirely eliminated. More generally, we argue that if all linkages be¬ 
tween parents and children are truly operative, then market prices 
play no role in the resource allocation process: the distribution of all 
goods is determined by the nature of intergenerational altruism. 

If taken literally, these results would have profound implications 
for the study of economics. We hardly intend to suggest that such 
extreme conclusions are warranted. Rather, when results stretch the 
bounds of credibility past the breaking point, it is natural to question 
the validity of underlying assumptions. We must therefore emphasize 
that we have obtained these results under relatively weak conditions 
and that these same conditions are the fundamental building blocks 
of the dynastic model. Thus refusal to accept the practical implica¬ 
tions of our results is tantamount to a rejection of the dynastic frame¬ 
work and calls into serious question the results (such as Ricardian 
equivalence) that follow from it. Further analyses of economic 
policies, such as the effects of government debt, require us to specify 
quite explicitly which of the underlying assumptions fails and how it 
fails. 

The paper is organized as follows. In Section II, we consider some 
simple examples that illustrate the principles driving our neutrality 
results. Section III presents the general model. In Section IV, we 
discuss the notion of operative private transfers and argue that 
parent-child linkages are alone sufficient to interconnect virtually the 
entire population. Section V contains the central neutrality theorem. 
We describe various extensions and qualifications in Section VI. Fi¬ 
nally, in Section VII, we clarify the nature of our results, reexamine 
the central assumptions, and consider various interpretations. 


II. Examples 

The linear structure of “dynastic” families in Barro (1974) allows him 
to model a family as, essentially, a single, infinite-lived consumer with 
dynamically consistent preferences. In particular, he specifies the 
well-being of generation $t$ ($u_t$) as a function of $t$'s own consumption
and the utility of $t$'s immediate successor: $u_t = v_t(c_t, u_{t+1})$. Popular
intuitive explanations of Ricardian equivalence are closely tied to this
formulation: since the dynastic family chooses an optimal program, it
will simply offset any exogenous, intertemporal redistributions of
resources that might displace it from the optimum. 3

In practice, families are not independent linear units but rather 
complex, interlocking networks, in which unrelated individuals share 
common descendants. As a result, it is in general impossible to repre¬ 
sent any particular family (or set of families) as a single, utility- 
maximizing agent, even when the well-being of each individual is 
assumed to depend only on his own consumption and the well-being 
of his children, as above. 4 It is perhaps somewhat surprising that this
observation does not invalidate Ricardian equivalence. As we shall 
see, neutralization of public transfers depends only on the existence 
of operative, altruistically motivated private transfers, and not on any 
particular pattern of linkages or specification of preferences. Yet 
therein lies the difficulty. For if, as we argue in Section IV, virtually all 
the population is interconnected through chains involving parent- 
child linkages, then Ricardian equivalence is merely one mani¬ 
festation of a much more powerful and implausible neutrality 
theorem. 

We begin our analysis with two simple examples that serve to illus¬ 
trate basic concepts. 


Example 1 

Suppose that there are three individuals, 1, 2, and 3. Individuals 1
and 2 have quasi-concave preferences of the form $u_i(c_i, c_3)$, $i = 1, 2$,
while 3's preferences are simply $u_3(c_3)$. We may think of 3 as a
common descendant of 1 and 2, who are unrelated. Each consumer $i$ is
endowed with wealth $w_i$. Individuals 1 and 2 divide this wealth
between own consumption $c_i$ and a nonnegative transfer to 3, $b_i$, $i = 1, 2$.
Individual 3 consumes $w_3 + b_1 + b_2$.

There are, of course, a variety of ways in which 1 and 2 might 
determine the magnitude of their transfers to 3. For the purpose of 


3 Barro (1974, p. 1097) explains that “current generations act effectively as though
they were infinite-lived when they are connected to future generations by a chain of
operative intergenerational transfers.” Subsequent papers reinforced the notion that
Ricardian equivalence is somehow tied to the dynamically consistent formulation of
family preferences; see, e.g., Buiter and Carmichael's (1984) dispute with Burbidge
(1983, 1984).

4 Equilibria are quite generally inefficient in models with interlocking families. One 
important reason has been emphasized by Nerlove, Razin, and Sadka (1984): when 
unrelated individuals share a common descendant, the consumption of that descendant 
is a public good. 
this illustration, we will assume that the exogenous environment is
such that 1 and 2 must make simultaneous, noncooperative choices. 
Accordingly, it is perhaps most natural to consider Nash equilibria in 
transfers to 3. Suppose that there exists a Nash equilibrium in which 1 
and 2 both make positive transfers. The reader may easily verify 
through a direct argument that private transfers will then neutralize 
the effects of all sufficiently small lump-sum redistributional policies, 5 
despite the fact that this extended family does not act as a single, 
utility-maximizing individual. 6 Throughout this paper, we use a more 
powerful but less direct line of argument, which works as follows. 

We have described an environment in which two agents, 1 and 2,
play a simple game. Each agent $i$ chooses an action (transfer) $b_i$ subject
to the constraint $b_i \ge 0$ and receives a payoff of
$u_i(w_i - b_i, w_3 + b_1 + b_2)$. By transferring $z$ from 1 to 2, the government
alters this game as follows. Each agent still chooses an action $b_i$ subject
to the constraint $b_i \ge 0$ but receives different payoffs:
$u_1(w_1 - z - b_1, w_3 + b_1 + b_2)$ and $u_2(w_2 + z - b_2, w_3 + b_1 + b_2)$.

Now we introduce the following change of variables: $\beta_1 = b_1 + z$
and $\beta_2 = b_2 - z$. That is, we think of agent 1 (2) as choosing $\beta_1$ ($\beta_2$)
subject to the constraint $\beta_1 \ge z$ ($\beta_2 \ge -z$) and receiving payoffs
$u_i(w_i - \beta_i, w_3 + \beta_1 + \beta_2)$. Note that this differs from the original game
in only two respects. First, the same abstract action has a different
practical interpretation in each case. For example, we associate the
choice “$b_1 = 5$” with the interpretative label “1 transfers $5 to 3,”
while we associate “$\beta_1 = 5$” with the label “1 transfers $5 - z to 3.”
Since all standard solution concepts have the property that
“strategically equivalent” 7 games give rise to equivalent sets of equilibria,
changing the interpretations of abstract actions is inconsequential. 8

5 In fact, one can think of agent 3 as a public project financed by voluntary contribu¬ 
tions, in which case the analysis of Bergstrom, Blume, and Varian (1984) establishes the 
neutrality result. We are aware that Laurence Kotlikoff also derived this result inde¬ 
pendently. These authors did not, however, note the strategic equivalence of the pre- 
and posttransfer games (which makes the result substantially more general), nor did 
they discover the neutrality of so-called distortionary taxes (and the implications,
discussed in Sec. VI, for the role of prices in resource allocation). In addition, the
framework employed by Bergstrom et al. is substantially more restrictive than the general
model considered in Sec. III (their neutrality result was established for networks of
interpersonal linkages with a very specific structure). We also note that Carmichael 
(1982) had previously recognized that the key to Barro’s theorem concerns the preser¬ 
vation of opportunity sets rather than the specification of altruism. 

6 In this example, equilibria are generally inefficient because of the public good 
problem noted in n. 4. 

7 Two games are strategically equivalent if they have the same extensive forms. 

8 Our central result depends critically on the assumption that strategically equivalent
games yield equivalent equilibria. Essentially, this implies that we can think entirely in 
terms of abstract actions, ignoring the primitive actions to which these actually corre¬ 
spond. Yet in many situations, primitive actions may play a role in determining behav- 
Second, agents' opportunity sets differ between the two games. Since
this difference is potentially substantive, we conclude that private
actions neutralize the effects of government transfer policies as long
as the original equilibrium is insensitive to perturbations in the agents'
constraints. This simple condition is, in principle, easily verifiable.
Suppose, for example, that we have an initial Nash equilibrium with
$b_i > 0$, $i = 1, 2$. Under the assumption that utility is quasi-concave,
equilibrium behavior is insensitive to small perturbations of the
constraints, so neutrality follows as an immediate corollary. 9
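
The neutrality argument of example 1 can be checked numerically. The Python sketch below is only an illustration under assumptions that are not in the text: preferences are taken to be u_i = ln(c_i) + a ln(c_3), the endowments are made-up numbers, and the function names are hypothetical. With these preferences each agent's best response is b_i = max(0, (a w_i - w_3 - b_j)/(1 + a)), and iterating best responses converges to the Nash equilibrium in transfers.

```python
def nash_transfers(w1, w2, w3, a=1.0, iters=200):
    """Nash equilibrium transfers (b1, b2) by best-response iteration.

    ASSUMED preferences: u_i = ln(c_i) + a * ln(c_3), with b_i >= 0.
    The first-order condition gives b_i = (a*w_i - w3 - b_j) / (1 + a),
    truncated at the corner constraint b_i >= 0.
    """
    b1 = b2 = 0.0
    for _ in range(iters):
        b1 = max(0.0, (a * w1 - w3 - b2) / (1.0 + a))
        b2 = max(0.0, (a * w2 - w3 - b1) / (1.0 + a))
    return b1, b2


def outcome(w1, w2, w3, z=0.0, a=1.0):
    """Consumption (c1, c2, c3) and transfers after the state moves z from 1 to 2."""
    b1, b2 = nash_transfers(w1 - z, w2 + z, w3, a)
    return (w1 - z - b1, w2 + z - b2, w3 + b1 + b2), (b1, b2)


# Hypothetical endowments: w1 = 10, w2 = 8, w3 = 1.
base_c, base_b = outcome(10.0, 8.0, 1.0)
small_c, small_b = outcome(10.0, 8.0, 1.0, z=1.0)  # both transfers stay positive
large_c, large_b = outcome(10.0, 8.0, 1.0, z=4.0)  # agent 1 hits the corner b1 = 0
print(base_c, small_c, large_c)
```

In this illustration the small redistribution leaves every consumption level unchanged: agent 1 cuts his transfer by z and agent 2 raises hers by z, so the relabeled actions b_1 + z and b_2 - z coincide with the original equilibrium transfers. The larger redistribution drives agent 1 to the corner, the constraint binds, and consumption changes, which is why the result is stated for sufficiently small policies.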

This alternative line of reasoning also allows us to conclude that 
neutrality will hold for a wide range of solution concepts. 10 Indeed, it 
is natural to expect that interior equilibria will typically satisfy the 
basic condition unless the corner constraints play a special role in 
defining the relevant solution concept. Suppose, for example, that 1 
and 2 can form binding contracts so that transfers are determined 
through bargaining. If the relevant threat point is $b_1 = b_2 = 0$, then
the Nash bargaining solution will not give rise to neutralizing behav¬ 
ior. However, if a breakdown in negotiations is followed by non- 
cooperative behavior (so that a noncooperative Nash equilibrium pre¬ 
vails) and if this threat point entails positive transfers, then the basic 
condition will generally hold, and private actions will neutralize small 
transfer policies. 

While it would be interesting to identify more primitive exogenous 
conditions under which linkages are operative (in the sense that trans¬ 
fers are positive and equilibria are robust with respect to perturba¬ 
tions of corner constraints), this is not our current objective. Instead, 
we critique the dynastic model by taking its central premises at face 
value, thereby implicitly restricting attention to the set of environ¬ 
ments that give rise to operative linkages. 


ior. The most obvious role arises when the game yields multiple equilibria: players may 
gravitate toward equilibria in which their choices are close to certain focal alternatives 
(e.g., zero transfers or transfers prior to the policy perturbation). It is, however, 
difficult to see how this possibility could invalidate our result without simultaneously 
rendering the dynastic framework inapplicable. 

9 Quasi concavity does not play a significant role in establishing this result. As long
as $u_i$ is continuous and $i$ strictly prefers $b_i$ to zero, the basic robustness condition is
satisfied. Since indifference between $b_i$ and zero is an extremely unlikely outcome (formally,
one can show that it is a measure zero event in the space of potential preferences), the 
existence of an equilibrium with positive transfers is generally sufficient to guarantee 
that private actions will neutralize sufficiently small government transfer policies. 

10 Consider, e.g., refinements of the Nash equilibrium concept. As long as an interior 
Nash equilibrium strictly satisfies (violates) certain refined criteria, small perturbations 
of the corner constraints will not generally cause it to violate (satisfy) these criteria. 
Indeed, the purpose of many refinements is to rule out equilibria that are sensitive to 
perturbations either in the environment or in the rules governing behavior (see
Fudenberg, Kreps, and Levine 1986).
Example 2

Suppose that everything is as in example 1, except that individual 1
chooses labor supply ($l_1$) prior to the choices of consumption and
transfers. Further assume that his utility is given by $u_1(c_1, c_3, l_1)$ and
that his wealth is $w_1 = w_1^0 + w l_1 - z(l_1)$, where $w_1^0$ is nonlabor income,
$w$ is the wage rate, and $z$ is a tax schedule, used for redistributing
wealth from 1 to 2. Thus 2's wealth is $w_2 = w_2^0 + z(l_1)$.

The extensive form of this game is represented schematically in
figure 1a. First, 1 chooses his labor supply; then 1 and 2 play a
“simultaneous move-transfer game,” as in example 1. Thus if 1 chooses
labor supply $l_1'$, 1 and 2 play a transfer game in which their
endowments are $w_1^0 + w l_1' - z(l_1')$ and $w_2^0 + z(l_1')$, respectively (this game is
denoted $G_1$). Similarly, if 1 chooses labor supply $l_1''$, 1 and 2 play the
corresponding transfer game $G_2$.

Suppose that the government contemplates an arbitrary change in 
the tax-transfer schedule from z to z'. At first, this may appear to alter 

FIG. 1. Schematic representation of example 2 (each panel begins with 1's labor choice). a, Policy z. b, Policy z'.
the game in a fundamental way. For instance, when 1 chooses $l_1'$, this
induces a simultaneous move-transfer game between 1 and 2, where
endowments are now $w_1^0 + w l_1' - z'(l_1')$ and $w_2^0 + z'(l_1')$, respectively
(this game is denoted $G_1'$ in fig. 1b). Yet the total resources of 1 and 2
are identical in $G_1$ and $G_1'$. Thus by the argument given in example 1,
if we appropriately perturb the corner constraints in $G_1$, we obtain a
game that is strategically equivalent to $G_1'$. Clearly, this same
reasoning applies regardless of 1's labor supply decision (for instance, an
appropriate perturbation in the corner constraints of $G_2$ yields a game
that is strategically equivalent to $G_2'$). Consequently, the only
substantive difference between these two environments consists of
perturbations in corner constraints. If the original equilibrium (or set of
equilibria) is robust with respect to such perturbations, then private
actions will offset sufficiently small policy changes.

Again, we would like to identify circumstances under which this 
robustness condition is likely to hold. For purposes of illustration, we 
consider “subgame-perfect” Nash equilibria (see Selten 1965, 1975). 11
This solution concept demands that agents act in their own best inter¬ 
ests at all times and serves to rule out threats that are not credible, in 
the sense that agents would not be willing to carry them out. Formally, 
a Nash equilibrium is subgame perfect if strategies form Nash equilib¬ 
ria in every proper subgame. In the current context, this implies that 
every choice of $l_1$ must be followed by Nash behavior in the ensuing
simultaneous move-transfer game. Our discussion in example 1 estab¬ 
lishes that, as long as the initial equilibrium entails positive transfers 
in some particular subgame, perturbations of the corner constraints 
will not generally alter behavior in that subgame. If this condition 
holds for every subgame, then the original set of perfect equilibria 
will indeed be robust with respect to arbitrary perturbations of the 
corner constraints. 12 


11 It is interesting to contrast these results with those that follow from use of the
unrefined Nash concept. There is typically a continuum of Nash equilibria for the game
described here, which we construct as follows. If 1 selects some $l_1^*$, 1 and 2 play Nash
choices in the ensuing subgame. For any other choice of $l_1$, 2 subsequently sets his
transfer to 3 equal to zero (1’s choices in these subgames are irrelevant). Effectively,
2 induces 1 to choose $l_1^*$ by threatening to hurt 3 unless 1 complies (of course, if 2
must hurt himself to carry out this threat, 1 might choose to call 2’s “bluff”;
accordingly, these equilibria are not subgame perfect). Clearly, 2’s ability to punish
1 through 3 determines the set of labor supply choices that are sustainable in some
equilibrium. Any transfer from 1 to 2 strengthens this ability and therefore expands
the range of potential outcomes. There is then no guarantee that private actions
neutralize redistributional policies; indeed, if 2 can select a threat, redistributions
ordinarily have real effects. It is, however, important to reiterate that in such
circumstances the central properties of dynastic behavior are also lost (see Sec. IV).

12 The assumption that transfers are positive for every choice of $l_1$ is, of course, quite
strong. However, the basic result typically holds as long as transfers are positive
following any choice in some neighborhood of the original equilibrium selection, $l_1^*$. By the
The irrelevance of an apparently distortionary tax may, at first,
seem counterintuitive. Indeed, readers who are unfamiliar with
abstract game-theoretic arguments may wish to verify this result directly
through standard comparative statics. 13 The imposition of a tax
schedule certainly appears to change the relative price of 1’s leisure;
should this not affect his decision? The fundamental insight here is
that the price of 1’s leisure is not simply $w$ since he must also consider
the effect of his labor-leisure choice on 2’s transfer to 3. Thus he faces 
some “shadow” wage, and it is this shadow wage that is invariant with 
respect to tax policy. 

This invariance is easiest to understand in a single-consumer world. 
As long as the government must balance its budget, no tax can distort 
behavior since the individual knows that all revenues must be re¬ 
turned to him at some point. By way of contrast, in a representative 
consumer world each consumer is thought of as small relative to the 
economy so that the fraction of marginal revenues distributed to any 
one consumer is negligible and (in the absence of altruistic linkages) 
can be ignored. The point of our analysis is that, as long as consumers 
are linked through operative transfers, all marginal revenues associ¬ 
ated with the taxation of a particular individual are, regardless of 
population size, eventually returned to that same individual, just as in 
a single-consumer world. 


III. The Model 

We consider a discrete-time ($t = 1, 2, \ldots$), infinite-horizon
overlapping-generations model. For simplicity, we assume that there is
one composite good that can be either consumed or invested. Current
output is determined through a constant-returns-to-scale production
technology as a function of current labor inputs and investment from
the previous period. Markets are perfectly competitive; firms earn
zero profits and pay factor inputs their marginal products. Labor and
capital are each homogeneous so that in period $t$ all labor receives a
wage rate of $w_t$ and capital yields a gross return of $\alpha_t^s$, paid in period
$t + 1$ (the net rate of return is equal to $\alpha_t^s - 1$).


preceding argument, policy changes will not alter the consequences of 1’s choices in this
neighborhood; hence, $l_1^*$ remains a local optimum. Furthermore, if 1 strictly prefers $l_1^*$
to alternatives outside of this neighborhood and if equilibrium outcomes following such
alternatives vary continuously with the values of corner constraints, then $l_1^*$ remains a
global optimum for sufficiently small perturbations. This last condition (continuity) is
relatively weak and, e.g., follows from quasi concavity of $u_i$.
13 Details of such calculations are available from the authors on request. The reader
may also wish to consult Bernheim (1986).
Let $I_t$ be the set of individuals born in period $t$. We suppose that
every individual lives for $M + 1$ periods. Thus $I^t$, the set of individuals
living at time $t$, is given by

$$I^t = \bigcup_{k=0}^{M} I_{t-k}.$$

We will use $N_t$ and $N^t$ to denote the number of individuals in $I_t$ and $I^t$,
respectively.

At time $t$, each individual $i \in I^t$ chooses consumption ($c_t^i$), labor
supply ($l_t^i$), transfers to other living individuals ($b_t^{ij}$, $j \in I^t - \{i\}$), 14
purchases of physical capital ($s_t^i$), and purchases of short-term
(single-period) bonds ($d_t^i$). Let $\alpha_t$ denote the gross rate of return on period $t$
bonds ($\alpha_t - 1$ is the interest rate). It is convenient to define

$$\alpha^t \equiv \prod_{\tau=1}^{t-1} \alpha_\tau .$$

Throughout our analysis, we employ the following notation. Let $b_t^i$
denote the vector of $i$’s transfers in period $t$:

$$b_t^i \equiv (b_t^{ij})_{j \in I^t - \{i\}}.$$

We will use $B^t$ to indicate the sequence of all transfer choices up to
period $t - 1$:

$$B^t \equiv \left((b_1^i)_{i \in I^1}, \ldots, (b_{t-1}^i)_{i \in I^{t-1}}\right)$$

(similarly for $C^t$, $L^t$, $S^t$, and $D^t$). We define a “$t$-history” of the economy
as a complete record of all choices made through the end of period
$t - 1$: $H^t = (C^t, L^t, B^t, S^t, D^t)$.

The government participates in this economy by financing a stream 
of real expenditures through tax levies and bond sales. Throughout, 
we assume that the stream of real expenditures $(g_t)_{t \ge 1}$ is fixed and
focus on the effects of alternative financial policies. We allow the
government to specify period $t$ taxes on individual $i$ as an arbitrary
function, $z_t^i(H^t)$, of the observed $t$-history. 15 Note that this very general
specification subsumes taxes on labor income, capital income, and 
transfers. It also allows for idiosyncratic provisions, such as income 


14 One could also allow consumers to lock in transfers for a number of years, includ¬ 
ing transfers to unborn generations. This would change nothing of substance. 

15 Note that we do not allow $z_t^i$ to depend on current (period $t$) choices. Effectively, the
government collects tax revenues at the “end” of each time period (in the last period of
life, revenues must be collected from the individual's estate or, equivalently, from his
heirs). This aspect of our model is an artifact of discrete time. We model the
government policy in this way so that private and public transfers are on an equal footing
(within a single period, both may be conditioned on the same behavior).
averaging. The government may also condition one individual’s taxes
on another’s actions. In short, virtually any action may be taxed, and 
the corresponding rate schedule may be chosen without restriction. 

Given a tax policy and real expenditure stream, the government 
balances its budget by issuing debt. Specifically, the supply of govern¬ 
ment bonds evolves as follows: 

$$d^t(H^t) + \sum_{i \in I^t} z_t^i(H^t) = \alpha_{t-1} d^{t-1}(H^{t-1}) + g_t .$$

For arbitrary taxes and expenditures, the implied deficit profile may, 
of course, be infeasible (i.e., debt might eventually exceed economic 
resources). We implicitly exclude such policies from consideration.
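
As a rough numerical illustration of this budget identity, the following Python sketch rolls the bond supply forward from a path of spending, aggregate tax revenue, and gross returns. It assumes the identity reads $d^t + \sum_i z_t^i = \alpha_{t-1} d^{t-1} + g_t$ with zero initial debt; the function name `debt_path` and the sample numbers are hypothetical and are not taken from the paper.

```python
from fractions import Fraction

def debt_path(spending, revenue, gross_returns):
    """Return the bond supply d^1, d^2, ... implied by budget balance.

    Assumption: d^t = alpha_{t-1} * d^{t-1} + g_t - (total taxes in t), d^0 = 0.
    gross_returns[k] is the gross return on bonds issued in period k + 1.
    """
    debts = []
    previous_debt = Fraction(0)
    previous_return = Fraction(1)  # irrelevant while previous_debt is zero
    for g, taxes, alpha in zip(spending, revenue, gross_returns):
        debt = previous_return * previous_debt + g - taxes
        debts.append(debt)
        previous_debt, previous_return = debt, alpha
    return debts

# Hypothetical example: a period-1 deficit of 2 is rolled over at 10 percent.
path = debt_path(
    spending=[Fraction(10), Fraction(10), Fraction(10)],
    revenue=[Fraction(8), Fraction(10), Fraction(10)],
    gross_returns=[Fraction(11, 10)] * 3,
)
print(path)
```

In this example the debt grows at the rate of interest once the primary budget is balanced, which is the kind of profile that could become infeasible if continued indefinitely.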

Prevailing prices and government financial policy determine the 
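opportunity constraint of each consumer. Specifically, for each $i \in I^t$,

$$c_t^i + s_t^i + d_t^i + \sum_{j \in I^t - \{i\}} b_t^{ij} + z_t^i(H^t) = w_t l_t^i + \alpha_{t-1}^s s_{t-1}^i + \alpha_{t-1} d_{t-1}^i + \sum_{j \in I^t - \{i\}} b_t^{ji}.$$

In addition, $i$ confronts a number of feasibility constraints: $0 \le l_t^i$,
$0 \le s_t^i$, and $\underline{b}_t^{ij}(H^t) \le b_t^{ij}$.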

Note that we do not impose any constraints on $i$’s purchases of
bonds. In particular, it is possible to have $d_t^i < 0$, which signifies that $i$
borrows at the competitive rate, $\alpha_t$. 16 Thus we have implicitly assumed
that capital markets are perfect. 17

Note also that we allow the lower bounds on $i$’s period $t$ transfers,
$\underline{b}_t^{ij}(H^t)$, to depend on the evolution of prior decisions, $H^t$. Ordinarily,
we would expect these lower bounds to be invariant with respect to $H^t$
(generally, they equal zero). The more general formulation adopted 
here allows us to contemplate a wider class of perturbations to the 
corner constraints and, correspondingly, a wider class of alternative 
financial policies. 

We complete the model by assuming that it is possible to represent
the preferences of each consumer $i$ by a utility function defined over
choices of consumption and labor: $u_i(C^\infty, L^\infty)$. Since we allow for
dependence on the entire history of choices, this specification is
extremely general. 18 By imposing additional restrictions, one can obtain
various formulations employed by other authors, such as Barro’s
(1974) dynastic specification. We do, however, explicitly rule out
direct dependence of preferences on the levels of transfers. This
restriction is, of course, essential.

16 We will, however, impose the restriction that $d_t^i \ge 0$ in the final period of life so that
$i$’s ability to bequeath debts is determined solely by the lower bounds on $b_t^{ij}$.

17 It is straightforward to relax this assumption by artificially limiting borrowing; as
long as liquidity constraints are generally nonbinding, our central results continue to
hold. Since the absence of binding liquidity constraints is fundamental to the dynastic
formulation, it is appropriate for us to focus our attention on this case.

As in the preceding section, our central result will depend on a 
hypothesis about the sensitivity of equilibria to perturbations in the 
corner constraint. Since we generally expect interior equilibria to 
satisfy this hypothesis under a wide range of solution concepts and 
since the hypothesis is in principle verifiable in any particular context, 
we wish to avoid tying our analysis directly to a particular notion of 
equilibrium. Thus we will describe behavior within this economy in as 
general terms as possible. We assume that consumers take the wage 
rates and interest rates as fixed. A given profile of factor prices
induces a game in which in each period $t$ consumers choose
consumptions, labor supplies, transfers, purchases of capital, and purchases of
bonds. Each distinct $t$-history $H^t$ identifies a distinct subgame
originating in period $t$. Players may condition their period $t$ choices on the
actual $t$-history that has resulted from previous play. Thus strategies
consist of functions mapping $t$-histories to current choices.

Individuals are, of course, constrained to select strategies that 
satisfy their opportunity constraints in each period $t$ for all feasible
$t$-histories and concurrent choices made by their contemporaries. This
requirement is more demanding than it might at first appear. In
particular, individual $i$ cannot simply specify $c_t^i$, $l_t^i$, $(b_t^{ij})_{j \in I^t - \{i\}}$, $s_t^i$, and $d_t^i$
as functions of $H^t$ subject to budget balance since concurrent
deviations by contemporaries (e.g., a change of $b_t^{ji}$ for some $j$) might render
these choices infeasible. Rather, he must allow one of these variables 
to be determined as a residual. As long as we confine our attention to 
pure strategy equilibria, i has no basis for preferring one residual 
variable to another (he always assumes that other players will select 
their equilibrium choices). For the purposes of our analysis, it is con¬ 
venient to assume that consumption is always the residual variable. 19 

For any particular solution concept, there is a (potentially empty) 
set of equilibria among consumers for the game induced by each 
profile of possible factor prices and interest rates. We will say that a 

18 Note in particular that we do not require the utility of each individual to vary with
the choices of all other individuals; indeed, $u_i$ would ordinarily be insensitive to changes
in most of its potential arguments.

19 For this reason, we cannot (as a formal matter) constrain consumption to be
nonnegative in all subgames. Under weak conditions it will, however, be positive on the
equilibrium path. While selecting some other residual variable might well alter the 
specific set of equilibrium outcomes (since this changes the consequences of deviating 
from equilibrium actions for the deviating player), it would not affect our central 
arguments. 
particular price profile is a “full equilibrium” if there exists an equilib¬
rium among consumers relative to these prices such that, along the 
equilibrium path, (i) consumers demand in aggregate the amount of 
bonds supplied by the government and (ii) the aggregate demand for 
capital (labor) equals the aggregate supply of capital (labor). Condi¬ 
tion ii holds as long as full employment of factor supplies generates 
marginal products equal to the assumed factor prices. 

IV. Operative Links and the Linkage Hypothesis 

Barro’s (1974) formulation of the dynastic family employs both a 
restrictive specification of preferences and a restrictive notion of equi¬ 
librium (see n. 2). In addition, he assumes that successive generations 
are operatively linked, in the sense that parents make positive, discre¬ 
tionary transfers to their children. Within his framework, this condi¬ 
tion is sufficient to guarantee that equilibrium behavior is insensitive 
to perturbations in the lower bounds that constrain specific transfer 
decisions.

In less restrictive models, perturbations of corner constraints could 
conceivably alter equilibrium behavior, even when the corresponding 
transfers are strictly positive. 20 This raises an important question: Do 
Barro’s results require the irrelevance of such perturbations, or do 
they depend only on the weaker requirement that transfers are posi¬ 
tive? In Bernheim and Bagwell (1986), we posed this question in a
representative consumer model much like Barro’s, except that we 
allowed for a larger class of preferences. Under the supposition that 
an arbitrarily small perturbation to the corner constraint corre¬ 
sponding to some parent-child linkage would have real effects, we 
demonstrated that one could design an arbitrarily small deficit policy 
that would also have real effects. 21 We conclude that the validity of the 


20 One might initially hope to rule out this possibility by imposing appropriate con¬ 
vexity conditions, as in Sec. II. Unfortunately, when the preferences of successive 
generations conflict, convexity of an individual’s decision problem depends not only on 
the characteristics of utility functions and budget constraints but also on the properties 
of equilibrium strategies. In general, it is extremely difficult to guarantee that these 
strategies are well behaved (see, e.g., Kohlberg 1976; Bernheim and Ray 1983, 1987;
Leininger 1986). Furthermore, in various kinds of noncooperative equilibria, one indi¬ 
vidual may condition his transfer on another's behavior to exert influence. If the first 
individual can credibly threaten to sever financial ties, then the location of his corner 
constraint will affect behavior even if one observes positive discretionary transfers in 
equilibrium (see Bernheim, Shleifer, and Summers 1985). It is very difficult to rule out
this possibility by imposing restrictions either on the exogenous environment or on the 
notion of equilibrium. 

21 This result actually follows as a corollary of the arguments in Sec. V of this paper 
since there we establish a one-to-one relationship between perturbations to corner 
constraints and fiscal policies. 
dynastic formulation depends on the irrelevance of perturbations to
corner constraints; in general, the observation that transfers are posi¬ 
tive does not by itself guarantee dynastic properties. 22 

It is therefore appropriate to formalize the notion of an operative 
link as follows. A “potential link” is a triplet, $(i, j, t)$, such that $i \ne j$ and
$i, j \in I^t$ (since $i$ and $j$ are both alive in period $t$, it is conceivable that $i$
could transfer resources to $j$ at that time). Let $A$ be the set of all
potential links. Consider some $X \subset A$, and choose some vector of real
numbers $\epsilon = (\epsilon_t^{ij})_{(i,j,t) \in X}$. The vector of functions $(\beta_t^{ij})_{(i,j,t) \in X}$ is an
$\epsilon$-perturbation of the original constraints $(\underline{b}_t^{ij})_{(i,j,t) \in X}$ if, for all $(i, j, t) \in X$
and $H^t$, $|\beta_t^{ij}(H^t)| < \epsilon_t^{ij}$. Subsequent to some perturbation, the
constraints become

$$b_t^{ij} \ge \underline{b}_t^{ij}(H^t) + \beta_t^{ij}(H^t)$$

for all $H^t$ and $(i, j, t) \in X$. We will say that $X \subset A$ is a set of “jointly
operative links” if there exists some $\epsilon > 0$ such that no $\epsilon$-perturbation
of $(\underline{b}_t^{ij})_{(i,j,t) \in X}$ alters the set of equilibria in the game induced by
the prevailing profile of factor prices.

The dynastic model is founded on the assumption that all parent- 
child linkages are jointly operative. Rather than search for a set of 
exogenous conditions that guarantees this result, we simply take the 
assumption at face value and investigate its implications. As a first 
step, we argue that parent-child linkages alone should ordinarily 
suffice to interconnect the entire population. 23

For illustrative purposes, suppose that all individuals marry, that 
each marriage produces two children, and that parents are opera¬ 
tively linked to their children. Then when redundancies are ig¬ 
nored, 24 every individual is indirectly linked through common de¬ 
scendants in the next G generations to 

$$\phi(G) = \sum_{g=1}^{G-1} 2^{2g-1}$$

members of its own generation. If we assume (as seems natural) that 
spouses are operatively linked to each other, then one fewer genera- 


22 This conclusion has important empirical implications in that it becomes very
difficult to establish whether links are, in fact, operative. Bernheim et al. (1985) have,
e.g., presented evidence indicating that corner constraints often matter despite the fact
that transfers are positive.

23 In practice, people are connected through operative transfers not just to children
but also to siblings, nieces, nephews, cousins, charities, political organizations, and so
forth. Thus we strongly suspect that, even accounting for the childless, very few
individuals or groups of individuals are truly isolated in the sense discussed here.

24 For example, the possibility that siblings have different in-laws. For large
populations and small values of $G$, this is presumably an excellent approximation.




tion is required to establish the same links. Accordingly, to appreciate 
the importance of links spanning two, three, and four generations, we 
note that for $G = 2$, 3, and 4, $\phi(G)$ is 2, 10, and 42, respectively.
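
The reported values can be checked directly. The short Python sketch below assumes the sum takes the form $\phi(G) = \sum_{g=1}^{G-1} 2^{2g-1}$, which is the reading that reproduces the figures 2, 10, and 42 quoted in the text; the function name `phi` is chosen here for illustration.

```python
def phi(G: int) -> int:
    """Same-generation individuals linked through common descendants
    within G generations, ignoring redundancies (assumed closed form)."""
    return sum(2 ** (2 * g - 1) for g in range(1, G))

# Values for links spanning two, three, and four generations.
values = [phi(G) for G in (2, 3, 4)]
print(values)  # [2, 10, 42]
```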

This, however, is only the “tip of the iceberg.’’ Once we have estab¬ 
lished that one household is connected to another of the same genera¬ 
tion, we may extend the chain further by moving up and down the 
family tree as many times as desired. Thus operative linkages form 
complex networks, perhaps interconnecting large segments of the 
population. Indeed, if each couple is connected to 10 others (through 
grandchildren), then the probability of finding a “cycle” (a set of 
interconnected individuals who are isolated from the rest of the pop¬ 
ulation) seems quite small. 

Although we were unable to obtain any formal results along these 
lines, we did conduct a large number of Monte Carlo simulations. In 
each simulation, we fixed the number of households (N) in the initial 
generation and, under the assumption that each household produced 
two children, arranged marriages between these children. We then 
repeated this procedure for grandchildren. 25 We took all marriages to
be equally likely 26 and, in particular, did not rule out marriages
between siblings. Out of 100 simulations with $N = 20$, the population
was completely interconnected in 96 cases. For $N = 50$, the figure was
100 out of 100, and for $N = 100$, it was 98 out of 100. We also
conducted 20 simulations for N = 1,000 and found that the popula¬ 
tion was completely interconnected in every case. Furthermore, every 
instance of incomplete interconnection resulted from the existence of 
a single, completely incestuous family (i.e., siblings married siblings in 
two consecutive generations). 27 
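
A simulation in this spirit can be sketched in a few lines of Python. This is not the authors' code: it is a minimal reconstruction under stated assumptions, namely that every household has exactly two children, that children are paired uniformly at random (siblings included), that each new household is linked to both spouses' parental households, and that "completely interconnected" means all households across the simulated generations fall in one connected component. The names `fully_connected` and `connected_share` are hypothetical, and the shares it produces need not match the figures reported above.

```python
import random

def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(parent, a, b):
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_a] = root_b

def fully_connected(n_households, rng, generations=3):
    """Simulate random marriages and report whether all households are linked."""
    total = n_households * generations
    parent = list(range(total))
    current = list(range(n_households))   # household ids of the current generation
    next_id = n_households
    for _ in range(generations - 1):
        # Each household has two children; a child is tagged by its parents' id.
        children = [h for h in current for _ in range(2)]
        rng.shuffle(children)              # uniformly random pairing
        new_households = []
        for k in range(0, len(children), 2):
            # The new couple is linked to both parental households.
            union(parent, next_id, children[k])
            union(parent, next_id, children[k + 1])
            new_households.append(next_id)
            next_id += 1
        current = new_households
    return len({find(parent, x) for x in range(total)}) == 1

def connected_share(n_households, trials, seed):
    """Fraction of simulated populations that are completely interconnected."""
    rng = random.Random(seed)
    hits = sum(fully_connected(n_households, rng) for _ in range(trials))
    return hits / trials

print(connected_share(20, trials=100, seed=0))
```

Setting `generations=2` drops the grandchildren, which corresponds loosely to the restricted exercise described in note 25.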

25 We also conducted a separate set of simulations in which we limited consideration 
to linkages between the first two generations (i.e., we omitted grandchildren). Although 
complete interconnection was relatively rare, a large number of people were never¬ 
theless linked together. In 100 simulations for N = 20, the largest connected compo¬ 
nent averaged 61 percent of the population; for N = 50 and 100, the figure was 63 
percent. 

26 In reality, not all links are equally probable. Individuals are most likely to marry 
others who live in the same geographic areas, and some communities, such as the 
Amish, are almost entirely self-contained. Note, however, that even with a near-perfect 
caste system, it takes only one “intermarriage” to link the entire population. In practice, 
marital links between identifiable population subgroups are probably quite common, 
particularly since many groups overlap. As a result, we suspect that our assumption is 
probably innocuous. 

27 In Bernheim and Bagwell (1986), we have also discussed some related results from
the theory of random graphs. Unfortunately, this framework is not ideally suited to our
current purposes in that it does not impose much of the structure implied by family
relationships. Nevertheless, calculations based on known asymptotic distributions (see
Erdős and Rényi 1959) corroborate our simulation results. In one respect, the
random-graphs framework is superior to the stylized model of family relationship considered
in this paper. Specifically, the number of edges terminating at each node is determined
randomly. For purposes of interpretation, one can think of the resulting heterogeneity
Note that the complexity of family networks renders perhaps the
majority of interpersonal linkages redundant: typically, two individ¬ 
uals will be connected through several distinct channels. Conse¬ 
quently, complete interconnection of the population ordinarily fol¬ 
lows under weaker conditions than those imposed by Barro in that all 
parent-child linkages need not be operative. 

On the basis of the preceding observations, we henceforth confine 
attention to equilibria satisfying a certain strong condition, which we 
designate the “linkage hypothesis." 

Linkage hypothesis. There exists a set of jointly operative links $X$
and an integer $T$ such that for each $t \ge 1$ the following property holds.
For all $i, j \in I^t$, there exist a finite integer $p$ and sequences
$(i_1, \ldots, i_p)$ and $(\tau_1, \ldots, \tau_{p-1})$ with $i = i_1$ and $j = i_p$ such that, for
$k = 1, \ldots, p - 1$, $t \le \tau_k \le t + T$, and either $(i_k, i_{k+1}, \tau_k) \in X$ or $(i_{k+1}, i_k, \tau_k) \in X$.
Loosely, this hypothesis implies that, in each period t, one can find a 
chain of operative linkages connecting any two living individuals, with 
each link consisting of a transfer made sometime between periods t 
and t + T. 
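
The chain condition can be made concrete with a small search routine. The Python sketch below is an illustration only: it takes a hypothetical set of links, each written as a triple (giver, receiver, period), and looks for a chain between two individuals that uses only links dated between $t$ and $t + T$. It checks one pair in one period; the hypothesis itself requires such a chain for every pair of living individuals in every period, and it also requires the links to be jointly operative, which no graph search can verify. The function name `find_chain` is an assumption of this sketch.

```python
from collections import deque

def find_chain(i, j, t, T, links):
    """Breadth-first search for a chain from i to j using links (a, b, tau)
    with t <= tau <= t + T, in either direction.
    Returns (individuals, periods) or None if no chain exists."""
    neighbors = {}
    for a, b, tau in links:
        if t <= tau <= t + T:
            neighbors.setdefault(a, []).append((b, tau))
            neighbors.setdefault(b, []).append((a, tau))
    previous = {i: None}
    queue = deque([i])
    while queue:
        node = queue.popleft()
        if node == j:
            break
        for nxt, tau in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = (node, tau)
                queue.append(nxt)
    if j not in previous:
        return None
    people, periods = [j], []
    while previous[people[-1]] is not None:
        node, tau = previous[people[-1]]
        periods.append(tau)
        people.append(node)
    return people[::-1], periods[::-1]

# Hypothetical example: two adults linked only through a common child.
links = {("parent", "child", 1), ("aunt", "child", 2)}
print(find_chain("parent", "aunt", t=1, T=1, links=links))
```

With $T = 0$ the second link falls outside the admissible window and the search returns `None`, which illustrates why the hypothesis allows links dated up to $T$ periods ahead.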

V. The Central Result 

We now demonstrate that, when the linkage hypothesis is satisfied, 
sufficiently small but otherwise arbitrary perturbations of govern¬ 
ment fiscal policy are irrelevant. We define a policy perturbation as 
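follows. Consider some vector of real numbers $\eta = (\eta^1, \eta_0^1, \eta^2, \eta_0^2, \ldots)$;
$(\delta^t, (\xi_t^i)_{i \in I^t})_{t=1}^{\infty}$ is an $\eta$-perturbation of some initial policy $(d^t, (z_t^i)_{i \in I^t})_{t=1}^{\infty}$
if

$$|\xi_t^i(H^t)| < \eta^t, \qquad |\delta^t(H^t)| < \eta_0^t,$$

and

$$\delta^t(H^t) + \sum_{i \in I^t} \xi_t^i(H^t) = \alpha_{t-1} \delta^{t-1}(H^{t-1})$$

for all $t$, $i \in I^t$, and $t$-histories $H^t$. 28 Subsequent to the perturbation, the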

as reflecting differences in the number of children. Isolated nodes are then naturally
interpreted as childless households. As demonstrated by Erdős and Rényi,
asymptotically one obtains (with probability one) one completely interconnected set plus isolated
nodes.

28 Note that we have allowed the government to alter only its taxes and level of
borrowing; it cannot change the gross rate of interest that it offers on government
bonds ($\alpha_t$). This involves no loss of generality since the government can effectively
change this rate by taxing or subsidizing interest income.
government’s deficit and tax policies are given by $d^t(H^t) + \delta^t(H^t)$ and
$z_t^i(H^t) + \xi_t^i(H^t)$, respectively.

For the following result, we will assume that the government taxes
neither interest nor transfers. That is, for all $i$, $t$, $H^t$, and $\hat{H}^t$ with
$(C^t, L^t, S^t) = (\hat{C}^t, \hat{L}^t, \hat{S}^t)$, $z_t^i(H^t) = z_t^i(\hat{H}^t)$, $\xi_t^i(H^t) = \xi_t^i(\hat{H}^t)$, $d^t(H^t) = d^t(\hat{H}^t)$, and
$\delta^t(H^t) = \delta^t(\hat{H}^t)$. The introduction of taxes on interest income and
transfers substantially complicates the analysis. We defer considera¬ 
tion of such policies to Section VI. 

Proposition. Suppose that some full equilibrium satisfies the linkage
hypothesis. For some $\eta > 0$ and every $\eta$-perturbation of the
government’s fiscal policy, there exists a full equilibrium in which factor
prices, labor supplies, consumption decisions, and purchases of physi¬ 
cal capital are all unaffected. In such an equilibrium, the policy per¬ 
turbation simply induces offsetting private transfers and bond pur¬ 
chases. 29 

We establish this proposition by showing that, subsequent to the 
policy perturbation, there must exist an equilibrium for the game 
induced by the original profile of factor prices in which consumers 
select the same levels of consumption and factor supplies as in the 
original full equilibrium and in which the aggregate demand for gov¬ 
ernment bonds changes to match supply in every period. The desired 
conclusion follows immediately. 

We proceed by first considering three simple classes of policy per¬ 
turbations, labeled A, B, and C. These provide the building blocks for 
analyzing more complex fiscal policies. 

Class A 

This class of perturbations consists of transfers between pairs of
individuals who are directly linked. That is, we choose some $(i, j, t) \in X$,
select $\xi_t^i$ arbitrarily, and set $\xi_t^j(H^t) = -\xi_t^i(H^t)$ for all $H^t$. All other
aspects of fiscal policy remain unchanged.

To demonstrate the irrelevance of this policy, we consider the
following change of variables for $i$’s transfer choice. For each history $H^t$,
let


29 Technically, our result says nothing about the size of $\eta$; conceivably, only very small
policy perturbations might be irrelevant. However, when one employs the dynastic
framework, one assumes that its basic premise (including the assumption that
particular links are operative) is robust with respect to an interesting range of environments
(otherwise, the model would be inapplicable if the environment changed slightly).
Fiscal policy is certainly one aspect of the environment. If the dynastic assumptions
hold for a wide range of fiscal policies, then policy changes within this range are
irrelevant.
$$\hat{b}_t^{ij}(H^t) = b_t^{ij} + \xi_t^i(H^t)$$

(i.e., we adopt a different change of variables in each subgame).
Clearly, the same numerical choices yield the same payoffs prior to
the perturbation and in the transformed game after the perturbation.
However, subsequent to the perturbation, the transfer constraint
becomes

$$\hat{b}_t^{ij}(H^t) \ge \underline{b}_t^{ij}(H^t) + \xi_t^i(H^t).$$

This game is therefore strategically equivalent to one in which the
perturbation $\beta_t^{ij}(H^t) \equiv \xi_t^i(H^t)$ is applied to $\underline{b}_t^{ij}$. Since $i$ and $j$ are
operatively linked, this perturbation of the corner constraint will not affect
behavior as long as $\xi_t^i$ is sufficiently small. But then the policy
perturbation is also irrelevant in the sense that it alters only $i$’s transfer to $j$.
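
The offsetting logic of a class A policy can be seen in a one-period accounting check. The Python sketch below uses hypothetical numbers: a tax increase of 3 on $i$ is rebated to $j$, and $i$ responds by cutting his private transfer to $j$ by the same amount. The helper `net_resources` and all figures are assumptions made for illustration, and the check presumes the transfer remains above its lower bound (zero here), as the argument requires.

```python
from fractions import Fraction

def net_resources(endowment_i, endowment_j, tax_i, tax_j, transfer_ij):
    """Resources left to i and j after taxes and a transfer from i to j."""
    net_i = endowment_i - tax_i - transfer_ij
    net_j = endowment_j - tax_j + transfer_ij
    return net_i, net_j

xi = Fraction(3)            # hypothetical tax increase on i, rebated to j
transfer = Fraction(10)     # hypothetical pre-policy transfer from i to j
lower_bound = Fraction(0)   # corner constraint on the transfer

before = net_resources(100, 40, 20, 5, transfer)
# After the policy, i lowers the private transfer by exactly xi.
after = net_resources(100, 40, 20 + xi, 5 - xi, transfer - xi)

print(before, after, transfer - xi >= lower_bound)
```

If the pre-policy transfer were smaller than the tax change, the adjusted transfer would hit the corner constraint and the offset would no longer be available, which is why the link must be operative.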

Class B 

In this class of policy perturbations, the government issues debt, dis¬ 
tributes the proceeds to some individual i, and retires the debt in the 
subsequent period by taxing the same individual. That is, we choose
$i \in I^t \cap I^{t+1}$, select $\xi_t^i$ arbitrarily, and set $\delta^t(H^t) = -\xi_t^i(H^t)$ and
$\xi_{t+1}^i(H^{t+1}) = -\alpha_t \xi_t^i(H^t)$ for all $H^t$. All other aspects of fiscal policy
remain unchanged.

To demonstrate the irrelevance of this policy, we consider the
following change of variables for $i$’s bond purchases. For each history $H^t$,
let

$$\hat{d}_t^i(H^t) = d_t^i - \delta^t(H^t).$$

Clearly, the same numerical choices yield the same payoffs prior to 
the perturbation and in the transformed game after the perturbation. 
Indeed, these two games are strategically equivalent since there are 
no constraints on borrowing or lending. Equilibrium therefore entails 
the same numerical choices. This implies that i simply increases his 
bond purchases by $\delta^t(H^t)$.
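
A two-period accounting check illustrates the class B argument. The Python sketch below is hypothetical throughout: it assumes an individual whose only asset is the one-period bond, with gross return `alpha`, and compares consumption with and without a debt-financed tax cut of size `delta` that is repaid with interest one period later. The helper `two_period_consumption` and the numbers are illustrative assumptions, not objects from the paper.

```python
from fractions import Fraction

def two_period_consumption(income, taxes, bonds, alpha):
    """Consumption in t and t+1 for an individual who buys `bonds` in t;
    alpha is the gross return on period-t bonds."""
    c_now = income[0] - taxes[0] - bonds
    c_next = income[1] - taxes[1] + alpha * bonds
    return c_now, c_next

alpha = Fraction(21, 20)
delta = Fraction(5)                      # hypothetical debt issue handed to i
income = (Fraction(50), Fraction(50))
taxes = (Fraction(10), Fraction(10))
bonds = Fraction(4)

baseline = two_period_consumption(income, taxes, bonds, alpha)
perturbed = two_period_consumption(
    income,
    (taxes[0] - delta, taxes[1] + alpha * delta),  # tax cut now, repaid with interest
    bonds + delta,                                  # i buys the newly issued debt
    alpha,
)
print(baseline, perturbed)
```

Because borrowing and lending are unconstrained, the same consumption path is available after the perturbation, and the only change is the larger bond holding.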

Class C 

In this class of perturbations, we consider transfers between pairs of 
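individuals who are alive at the same point in time. That is, we choose
$i, j \in I^t$, select $\xi_t^i$ arbitrarily, and set $\xi_t^j(H^t) = -\xi_t^i(H^t)$ for all $H^t$. All
other aspects of fiscal policy remain unchanged.

Let $i_1, \ldots, i_p$ be the sequence of individuals described in the linkage
hypothesis. For each $k = 1, \ldots, p - 1$, define the following policy
perturbations:

$$\hat{\xi}_{\tau_k}^{i_k}(H^{\tau_k}) = \xi_t^i(H^t) \frac{\alpha^{\tau_k}}{\alpha^t}, \qquad
\hat{\xi}_{\tau_k}^{i_{k+1}}(H^{\tau_k}) = -\xi_t^i(H^t) \frac{\alpha^{\tau_k}}{\alpha^t}. \tag{1}$$

For $k = 0, \ldots, p - 1$, let

$$\tilde{\xi}_{\tau_k}^{i_{k+1}}(H^{\tau_k}) = \xi_t^i(H^t) \frac{\alpha^{\tau_k}}{\alpha^t}, \qquad
\tilde{\xi}_{\tau_{k+1}}^{i_{k+1}}(H^{\tau_{k+1}}) = -\xi_t^i(H^t) \frac{\alpha^{\tau_{k+1}}}{\alpha^t},$$

$$\delta_k^{\tau}(H^{\tau}) = \operatorname{sgn}(\tau_k - \tau_{k+1})\, \xi_t^i(H^t) \frac{\alpha^{\tau}}{\alpha^t}, \qquad
\min(\tau_k, \tau_{k+1}) \le \tau < \max(\tau_k, \tau_{k+1}), \tag{2}$$

where $\tau_0 \equiv \tau_p \equiv t$, and $\alpha^{\tau}$ is the cumulative gross return defined in Section III.

First, we note that the cumulative effect of all these component
policies is equivalent to the effect of the original policy. In particular,
the effect on $i_k$ in period $\tau_{k-1}$ is

$$\hat{\xi}_{\tau_{k-1}}^{i_k}(H^{\tau_{k-1}}) + \tilde{\xi}_{\tau_{k-1}}^{i_k}(H^{\tau_{k-1}}) =
\begin{cases} 0, & k = 2, \ldots, p, \\ \xi_t^i(H^t), & k = 1. \end{cases}$$

Similarly, the effect on $i_k$ in period $\tau_k$ is

$$\hat{\xi}_{\tau_k}^{i_k}(H^{\tau_k}) + \tilde{\xi}_{\tau_k}^{i_k}(H^{\tau_k}) =
\begin{cases} 0, & k = 1, \ldots, p - 1, \\ -\xi_t^i(H^t), & k = p. \end{cases}$$

Finally, the change in period $\tau$ bond issues is

$$\sum_{k} \delta_k^{\tau}(H^{\tau}) = \xi_t^i(H^t) \frac{\alpha^{\tau}}{\alpha^t} \sum_{k} \operatorname{sgn}(\tau_k - \tau_{k+1}) = 0,$$

where both sums run over those $k \in \{0, \ldots, p - 1\}$ with $\min(\tau_k, \tau_{k+1}) \le \tau < \max(\tau_k, \tau_{k+1})$; the sum vanishes
since $\tau_0 = \tau_p$.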

Now note that, for each k, equation (1) is a class A policy and 
equation (2) is a collection of class B policies. We know that class B 
policies are always irrelevant. We can therefore focus on the class 
A policies. Reasoning as before, we see that adoption of all the class A 
policies described in (1) yields a game that is equivalent to one that is 
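induced by perturbing the original transfer constraints as follows:

$$\beta_{\tau_k}^{i_k i_{k+1}}(H^{\tau_k}) = \xi_t^i(H^t) \frac{\alpha^{\tau_k}}{\alpha^t} \quad \text{if } (i_k, i_{k+1}, \tau_k) \in X,$$

$$\beta_{\tau_k}^{i_{k+1} i_k}(H^{\tau_k}) = -\xi_t^i(H^t) \frac{\alpha^{\tau_k}}{\alpha^t} \quad \text{if } (i_{k+1}, i_k, \tau_k) \in X.$$

Since no link need appear twice in this chain, we can obviously make
the composite perturbation to transfer constraints arbitrarily small by
taking $\xi_t^i$ small. Since a small perturbation of transfer constraints has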
no effect on behavior, the corresponding policy perturbation must be 
irrelevant. 
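
The bookkeeping behind the class C argument can be verified numerically. The Python sketch below rebuilds the class A and class B pieces along a chain and adds them up. It rests on assumptions that should be read as such: the component formulas are a reconstruction of equations (1) and (2), the cumulative return is taken to be $\alpha^{\tau} = \alpha_1 \cdots \alpha_{\tau-1}$, and the names `cumulative_return` and `decompose_class_c` as well as the sample chain are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def cumulative_return(alpha, tau):
    """Assumed convention: product of gross returns alpha_1 .. alpha_{tau-1}."""
    out = Fraction(1)
    for s in range(1, tau):
        out *= alpha[s]
    return out

def decompose_class_c(xi, t, chain, taus, alpha):
    """Sum the component policies for a tax xi on chain[0] and -xi on
    chain[-1] in period t. Returns (net_tax, net_debt), keyed by
    (individual, period) and by period."""
    p = len(chain)
    tau = [t] + list(taus) + [t]          # tau_0 = tau_p = t

    def scale(s):
        return xi * cumulative_return(alpha, s) / cumulative_return(alpha, t)

    net_tax = defaultdict(Fraction)
    net_debt = defaultdict(Fraction)
    # Class A pieces: i_k pays and i_{k+1} receives in period tau_k.
    for k in range(1, p):
        net_tax[(chain[k - 1], tau[k])] += scale(tau[k])
        net_tax[(chain[k], tau[k])] -= scale(tau[k])
    # Class B pieces: i_{k+1} shifts resources between tau_k and tau_{k+1}.
    for k in range(p):
        person = chain[k]
        net_tax[(person, tau[k])] += scale(tau[k])
        net_tax[(person, tau[k + 1])] -= scale(tau[k + 1])
        low, high = sorted((tau[k], tau[k + 1]))
        sign = (tau[k] > tau[k + 1]) - (tau[k] < tau[k + 1])
        for s in range(low, high):
            net_debt[s] += sign * scale(s)
    return net_tax, net_debt

# Hypothetical chain i -> m -> j with links dated 2 and 3, policy dated t = 1.
alpha = {1: Fraction(21, 20), 2: Fraction(11, 10)}
net_tax, net_debt = decompose_class_c(Fraction(1), 1, ["i", "m", "j"], [2, 3], alpha)
nonzero = {key: value for key, value in net_tax.items() if value != 0}
print(nonzero, dict(net_debt))
```

In the example all intermediate taxes and all debt changes cancel, leaving only the original tax on the first individual and the matching rebate to the last one in period $t$.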

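Having analyzed cases A, B, and C, we are now prepared to
consider an arbitrary policy perturbation, $(\delta^t, (\xi_t^i)_{i \in I^t})_{t=1}^{\infty}$. We begin by
decomposing this into component parts. For each $t$, we define
$N^t - N_{t-M}$ policy perturbations as follows. For all $i \in I^t - I_{t-M}$, let

$$\tilde{\xi}_t^i(H^t) = \frac{-\delta^t(H^t)}{N^t - N_{t-M}}, \qquad
\check{\xi}_{t+1}^i(H^{t+1}) = \frac{\alpha_t\, \delta^t(H^t)}{N^t - N_{t-M}}, \qquad
\tilde{\delta}_i^t(H^t) = \frac{\delta^t(H^t)}{N^t - N_{t-M}}. \tag{3}$$

Next, define $N^t - 1$ policy perturbations as follows. Choose some
$i^* \in I_{t-M}$ and, for all $i \in I^t - \{i^*\}$, let

$$\bar{\xi}_t^i(H^t) = \xi_t^i(H^t) - \tilde{\xi}_t^i(H^t) - \check{\xi}_t^i(H^t), \qquad
\bar{\xi}_t^{i^*,i}(H^t) = -\bar{\xi}_t^i(H^t), \tag{4}$$

where $\check{\xi}_t^i$ is the period $t$ component of the period $t - 1$ policy in (3), and terms
that (3) leaves undefined are taken to be zero.

We now establish that the cumulative effect of these component
policies is equivalent to the effect of the original policy. The effect on
debt at time $t$ is clearly

$$(N^t - N_{t-M}) \frac{\delta^t(H^t)}{N^t - N_{t-M}} = \delta^t(H^t).$$

The effect on $i \in I^t - \{i^*\}$ is

$$\tilde{\xi}_t^i(H^t) + \check{\xi}_t^i(H^t) + \bar{\xi}_t^i(H^t) = \xi_t^i(H^t).$$

Finally, the effect on $i^*$ is

$$\check{\xi}_t^{i^*}(H^t) + \sum_{i \in I^t - \{i^*\}} \bar{\xi}_t^{i^*,i}(H^t)
= \sum_{i \in I^t} \left[\tilde{\xi}_t^i(H^t) + \check{\xi}_t^i(H^t)\right] - \sum_{i \in I^t - \{i^*\}} \xi_t^i(H^t)$$

$$= -\delta^t(H^t) + \alpha_{t-1} \delta^{t-1}(H^{t-1}) - \sum_{i \in I^t - \{i^*\}} \xi_t^i(H^t)
= \xi_t^{i^*}(H^t),$$

where the last equality follows from the government’s budget
constraint.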

Note that, for each $i \in I^t - I_{t-M}$, (3) is a class B policy. Recall once
again that class B policies are completely irrelevant. We can therefore 
confine attention to the class C policies described in (4). We know that 
each class C policy induces a game that, after a change of variables, is 
equivalent to one in which we have perturbed corner constraints. 
Similarly, the entire policy induces a game that, after all component 
changes in variables are made, is equivalent to one in which we have 
made all component perturbations in the appropriate corner con¬ 
straints. We need only verify that the transformed game (i.e., the one 
in which variables have been changed) is well defined and that the 
corresponding aggregate perturbation to corner constraints can be 
made arbitrarily small by taking the policy to be small. 

Fix some period t and consider a link, (i, j, t) ∈ λ. This link might
appear in any chain connecting any two individuals living in periods 
t — T through t. However, it does not appear in any other chain. The 
total number of potential appearances is therefore finite; specifically, 
it is bounded by (f T ~ l)- 30 Consequently, the composite 

change of variables is well defined. The corresponding total perturba¬ 
tion to the link (*, j, t) is just equal to the sum of the component 
perturbations. We can clearly make this sum arbitrarily small by tak¬ 
ing (V~ r > tio _T ,.. •, t)\ sufficiently small. Now choose e such that 
no c-perturbation to the corner constraints affects equilibrium be¬ 
havior. Since there are a finite number of individuals living in period 
l, we can find some . . . , f|o >( ) > 0 such that, for t| 

satisfying 

(V' T » no“ r » - - •. V. ■no) ^ Ont~ T . *jo7/.^o.<)> 


30 Given the redundancy of interpersonal linkages noted in Sec. IV, it is very likely 
that the number of potential appearances is vastly greater than the number of actual 
appearances. This observation is of some importance because η, the bound on policy
perturbations, depends on the number of actual appearances.
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the aggregate perturbation to each bjj, p \ jt implied by any t|- 
perturbation of fiscal policy satisfies |P,j(H*)| < e{, (H') for all H* and i,j 
with (», j, t) E k. Thus, by choosing t| such that 

(V. Ho) = (min{f|{_ r ./$, min{f||,, f _ r . 

>0 

for all t, we guarantee that the aggregate perturbation to each bjj, {Jj,, 
implied by any ^-perturbation of fiscal policy satisfies |p{,(H‘)| < e^ (H') 
for all H' and ( i,j , /) E k. The proposition is therefore established. 


VI. Extensions and Qualifications 

So far we have confined our analysis to policies consisting of taxes 
levied on earnings, consumption, and income from physical capital. 
The introduction of taxes on interest income and transfers compli¬ 
cates our arguments significantly. Rather than provide a completely 
general extension of our result, we choose instead to illustrate basic 
principles through a simple example. 

Consider an economy consisting of three individuals, 1, 2, and 3,
whose decision variables and preferences are given as in example 1
(Sec. II). However, in this case suppose that 2 chooses his transfer to 3
($b_2$) after observing 1’s transfer ($b_1$) (i.e., 1 makes his choice in period
1, 2 makes his choice in period 2, and each agent i consumes in period
i). Suppose further for simplicity that the net interest rate is zero. We
allow for arbitrary taxes as in Section III. Thus the government collects
$z_1$ from 1 in period 1, so that 1 consumes $w_1 - b_1 - z_1$. Since $H^1$ is
degenerate, $z_1$ cannot be conditioned on any actions. In period 2, the
government collects $z_2(b_1)$ from 2 ($b_1$ completely describes $H^2$), so that
2 consumes $w_2 - b_2 - z_2(b_1)$. Finally, in period 3, the government
collects $z_3(b_1, b_2)$ from 3 ($b_1$ and $b_2$ completely describe $H^3$). Budget
balance requires that $z_3(b_1, b_2) = -z_1 - z_2(b_1)$, so that 3 consumes
$w_3 + [b_1 + z_2(b_1)] + b_2 + z_1$. This suggests a natural interpretation: $z_1$
is a lump-sum tax on 1, the proceeds of which are distributed to 3,
while $-z_2(b_1)$ is a tax on 1’s transfer to 3, the proceeds of which are
distributed to 2.

Now consider a policy perturbation, ({j, £ 2 (-)). For the new game, 
employ the following change of variables: 

Pi * + €ii 

P 2 = 62 + [z a (P, + ti) - * 2 (Pi)) + £s(Pi + £i)> 

It is easy to check that the same numerical choices produce the same 
levels of consumption in the original game and in the transformed, 




perturbed game. The only difference is that, in the latter, the con¬ 
straints are 


and 


Pi *«» 


f*2 ^ fotOi + li) ~ 2 aOi)] + feOi + li)- 

Clearly, if the policy perturbation is small and if $z_2$ is continuous, then
the corresponding perturbation to the corner constraints is also small.
Under the linkage hypothesis, the policy is therefore irrelevant. Thus
lump-sum redistributions from 1 to 3 have no effects despite the fact
that 1’s transfer to 3 is taxed, and changes in the transfer schedule are
also inconsequential. Unfortunately, a general formulation of this
result is extremely complex. 31
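The accounting behind this three-person example can be checked numerically. The sketch below is illustrative only: the endowments, the tax schedules, and the names `consumption`, `xi1`, and `xi2` are hypothetical, and it assumes that the perturbation is simply added to the original taxes. Rather than relying on a closed-form change of variables, it backs out the transformed transfer of agent 2 by matching his consumption, and then confirms that all three consumption levels coincide in the original and the perturbed game.

```python
def consumption(w, b1, b2, z1, z2):
    """Consumption of agents 1, 2, 3 when 1 and 2 transfer b1 and b2 to 3.

    z1 is the tax on agent 1, z2(b1) the tax schedule on agent 2, and
    budget balance sets agent 3's tax to -(z1 + z2(b1)).
    """
    c1 = w[0] - b1 - z1
    c2 = w[1] - b2 - z2(b1)
    c3 = w[2] + (b1 + z2(b1)) + b2 + z1
    return c1, c2, c3


# Hypothetical endowments and policy (not taken from the paper).
w = (10.0, 8.0, 3.0)
z1 = 1.0
z2 = lambda b1: -0.2 * b1        # -z2(b1): a 20 percent tax on 1's transfer, paid to 2

# Hypothetical small perturbation, assumed to be added to the taxes.
xi1 = 0.05
xi2 = lambda b1: 0.01 * b1 ** 2
z1_new = z1 + xi1
z2_new = lambda b1: z2(b1) + xi2(b1)

# Actual transfers chosen in the perturbed game.
b1, b2 = 2.0, 1.5
perturbed = consumption(w, b1, b2, z1_new, z2_new)

# Transformed choices: shift 1's transfer by xi1, then pick 2's transfer
# so that 2's consumption under the ORIGINAL policy is unchanged.
beta1 = b1 + xi1
beta2 = w[1] - z2(beta1) - perturbed[1]
original = consumption(w, beta1, beta2, z1, z2)

print(perturbed)
print(original)
```

Because total resources are fixed, matching the consumption of agents 1 and 2 automatically matches that of agent 3. The shifts `beta1 - b1` and `beta2 - b2` are small whenever the perturbation is small and the schedule is continuous, which is the sense in which only the corner constraints move.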

While we have couched this discussion entirely in terms of fiscal 
policy, our analysis also implies that prices are locally indeterminate 
and that sufficiently small changes in prices have no effect on the 
allocation of real resources. This conclusion follows directly from the 
observation that changing a price is formally analogous to imposing a 
tax on one party to a transaction and distributing the proceeds to the 
other party. 32 

If prices are irrelevant, then our competitive pricing assumption is 
plainly inessential. Even the assumption of price-taking behavior be¬ 
comes vacuous since one can change the price for any given transac- 


31 This is so for two reasons. First, as is evident from our example, the correspondence
between perturbations to policies and corner constraints is significantly more
complex than in Sec. V (see the formula for the lower bound on $\beta_2$). Second, elaborate
private transfers are required to offset even the simplest government redistributions 
(e.g., lump-sum transfers between operatively linked individuals). Indeed, one must 
rule out cases in which it is logistically impossible for the private sector to neutralize 
public redistributions (e.g., if the government imposes a 100 percent tax rate on all 
transfers to and from some individual, then redistributions involving this individual 
will typically have real effects). 

32 Technically, this argument raises some subtle issues. First, the formal analogy
between price changes and tax-transfer schemes does not hold if agents envision themselves
as trading with a fictitious entity called “the market” rather than with other
market participants. For prices to be irrelevant, each agent must realize that changes in
his sales or purchases necessarily alter the sales or purchases of other agents. Clearly, a
rational agent cannot believe otherwise. Traditionally, one ignores these effects in
competitive models on the grounds that each agent is small relative to the market, just as
one ignores budget-balancing distributions of marginal tax revenues. Analogous to our
discussion of taxes, this leads one astray under the current set of assumptions, regardless
of population size. Second, the general irrelevance of prices follows only if one
assumes that firms act in the interests of their owners rather than as profit-maximizing
automata (profit maximization is simply not a sensible objective in the world described
here). See Bernheim and Bagwell (1986) for a more complete discussion of these issues.
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tion without effect. It is therefore not surprising that one can also 
introduce market power without altering our central result. 

Suppose, for example, that in addition to a nonmonopolized con¬ 
sumption good, c, there is also a monopolized consumption good, x. 
The monopolist’s price will, of course, be indeterminate, at least 
within some range. He will exercise his market power by deciding 
who will consume x and in what amounts. We can think of the monop¬ 
olist as making operative transfers of x to others (the recipients of x 
may in turn pass some of it on). At the same time, there will be a 
network of operative transfers in c. Redistributions of c (x) between 
people who are operatively linked in c (x) will be irrelevant. If the 
linkage hypothesis is satisfied for c, then our central result applies 
with respect to redistributions of c. One may even condition such 
redistributions on transfers of x (i.e., tax the exercise of market 
power) without effect. Thus if government fiscal policy entails redis¬ 
tribution of units of account (dollars), the relevant question is whether 
the linkage hypothesis is satisfied for units of account. However, even 
if it is satisfied, market power will still matter: changing the identity of 
the monopolist will necessarily alter the pattern of operative linkages 
in x (certainly, the original monopolist will be driven to corners) and 
will therefore have real allocative effects. 

By now, it should be clear that other restrictive features of the 
model are not central to our analysis. It is, for example, relatively easy 
to disaggregate consumption, capital, and labor. One can then estab¬ 
lish that excise taxes and various partial factor taxes are irrelevant. 
One may also dispense with the assumption of constant returns to 
scale; we maintained this assumption simply to avoid the necessity of 
accounting for distributed profits. Perhaps the most important restrictions
concern uncertainty and information. Our model describes
a deterministic world in which all individuals are perfectly informed.
For the most part, we believe that these restrictions are also inessential.

First, suppose we introduce uncertainty concerning length of life, 
outputs, wages, or gross returns. One could simply view nature as a 
“player,” who selects current values of these variables according to 
some random scheme. One would then include nature’s choices in the 
description of a t-history and proceed as before. 33 Similar remarks 
apply to uncertainty concerning future government policy. Even if 
the government randomizes its actions, individuals may condition 

33 Insurance effects, such as those described in Barsky, Mankiw, and Zeldes (1986),
would not materialize since interpersonal transfers would neutralize government redistributive
policies for each realization.
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transfers on policy realizations. Thus each realization induces a game 
that is strategically equivalent to the “no-policy” game with perturbed 
corner constraints. Clearly, randomization between equivalent games
changes nothing of substance. 

Next, suppose that individuals have incomplete information about 
each other’s preferences. As long as an agent makes a transfer with 
probability one, perturbations to the corresponding corner constraint 
will not ordinarily affect his choice. A slight modification of the argu¬ 
ment in Sections II and V then establishes the desired result. 

A somewhat more subtle issue concerns information about opera¬ 
tive linkages. The chains that connect different individuals may be 
complex; indeed, two individuals may not know how they are con¬ 
nected. Yet it is not clear that this knowledge is at all essential. As long 
as individuals correctly perceive the effects of their own actions on 
payoffs, it does not matter if they understand the process that gener¬ 
ates these payoffs. Thus if we prescribe equilibrium actions that offset 
the effects of some transfer policy, individuals will be willing to abide 
by these prescriptions. However, this sidesteps a deep and difficult 
question: How do individuals arrive at the new prescriptions? This 
issue is completely analogous to the observation that if no single agent 
knows the “big picture” and coordinates actions, there is no guarantee 
that an economy will reach a standard competitive equilibrium. To 
resolve this issue, we would require a theory of how agents achieve 
equilibria; unfortunately, this important problem is poorly under¬ 
stood. One could envision an iterative process, wherein each individ¬ 
ual would reactively adjust his transfers, with the property that sta¬ 
tionary points correspond to equilibria. To the extent that individuals 
acknowledge the irrelevance of fiscal policy, the process of adjustment 
following a policy change might actually be very simple: all agents 
hold real activities (consumption or production) fixed and allow transfers
to absorb all residual resources. Finally, if one is unpersuaded by
these arguments and is unwilling to dogmatically accept the implica¬ 
tions of equilibrium theory, then one must also regard the dynastic 
model with considerable skepticism. 

These waters become still murkier if one allows for uncertainty 
concerning future linkages (e.g., those arising from marriages 
formed after some individual’s death). However, we have argued in 
Section IV that the linkage hypothesis is likely to be satisfied for small 
T: one need only use links spanning a few generations so that most of 
these links might well be known at the relevant point in time. Even 
when this is not the case, one can show that the central result con¬ 
tinues to hold as long as, for each pair of individuals, one can devise 
an algorithm that describes transfers as a function of realized linkages 
(e.g., marriage) and that connects this pair with probability one. Given 
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the huge number of linkages known to exist at each point in time, this 
condition does not seem very demanding. Of course, the process by 
which agents achieve such an equilibrium is again problematic. 

Finally, suppose that individuals cannot observe some set of actions 
taken by others. Asymmetric information of this sort may interfere 
with the neutralization of distortionary taxes. In the proof of our
central result, we transformed variables differently for subgames differentiated
according to prior choices of taxed activities. For this to be
valid, players with transformed actions must be able to distinguish
between these subgames; that is, they must be able to observe taxed
activities. Note, however, that these issues do not bear on the neutrality
of arbitrary lump-sum redistributions. Indeed, this weaker
form of neutrality does not even require individuals to observe each
other’s transfers. 34

VII. Interpretations and Conclusions 

What should one make of the rather perplexing conclusions reached 
in Sections II-VI? Several points of interpretation require further 
discussion. First, what general principles drive our neutrality results? 
Second, what does this analysis teach us about “real” economics? 
Third, where do we go from here? 

With respect to the first question, one might initially suspect that 
our result follows from the fact that each consumer acts as though he 
is part of a “big happy family,” which behaves as if it maximizes a
single utility function. Certainly, the discussions of Barro (1974) and 
Becker (1974) have this flavor. Yet this suspicion is simply false since 
our results concern neutrality, not optimality. In equilibrium, chosen 
actions may well be inefficient, yet redistributions contingent on these 
choices will not affect behavior. 35 

Aside from delimiting the scope of neutrality, this observation also 
implies that our results have no normative implications. Although all 
taxes are equivalent to lump-sum taxes, lump-sum taxation may not 
be desirable. Since the equilibrium ordinarily entails preexisting dis¬ 
tortions (due to intrafamily conflict), the government should wish to 

34 In example 1, we have assumed that 1 and 2 select transfers simultaneously. This is
equivalent to letting 1 choose first and assuming that 2 then selects a transfer without
having observed 1’s choice.

35 Recall example 2 of Sec. II. There we argued that a labor income tax is neutral
since it leaves 1’s effective wage unaltered. We did not, however, assert that 1 chooses
his labor supply optimally in the initial equilibrium. Specifically, we did not say that 1’s
effective wage is w. Rather, he faces some shadow wage, which reflects the effect of his
labor supply choices on subsequent transfers. Since the shadow wage need not equal w,
his choice is generally inefficient. We argued only that this shadow wage is insensitive to
tax and transfer policies.
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engage in second-best taxation. However, paradoxically, second-best 
tax instruments are unavailable. The government can introduce 
countervailing distortions only by conditioning real expenditures on 
consumer behavior. 

With respect to the second question, our analysis casts serious doubt 
on the usefulness of the dynastic framework as an analytic tool for 
studying public policy issues. Accordingly, one must regard any con¬ 
clusions derived within this framework with considerable skepticism. 
Barro’s Ricardian equivalence results, which concern the neutrality of 
public deficits and social security, are probably the best-known impli¬ 
cations that follow from dynastic assumptions. Yet our criticism also 
applies to numerous other studies that adopt Barro’s model. 36 

A natural response to our criticism is that one ought to view the 
dynastic formulation as only an approximation to reality; one should 
therefore expect properties such as Ricardian equivalence to hold 
only as approximations. Taking the premises of this model literally is 
simply unfair and is bound to generate some untenable results. 

We find this position completely unsatisfactory in that both the 
degree and nature of the approximation clearly matter a great deal. If 
we agree that taxes, transfers, and prices are not even close to being 
irrelevant, then we must also agree that in some important, policy¬ 
relevant sense the world is not even close to being dynastic. One 
cannot simply assert that the model holds as a good approximation in 
one context but not in another. It is essential to describe the approxi¬ 
mation explicitly so that analysts can identify a new set of assumptions 
and elucidate their implications. In practice, it is extremely difficult to 
modify the model in a plausible way that preserves Ricardian equiva¬ 
lence (at least as an approximation) while eliminating the untenable 
neutrality results without introducing new and equally troubling 
difficulties (see Abel and Bernheim [1986] and Bernheim and Bagwell
[1986] for discussions).

We devote the remainder of this section to the third question: 
Where do we go from here? Clearly, constructive analysis of public 
policy must be based on a model that departs from ours in some 
fundamental way. It is therefore natural to begin by summarizing our 
central premises. First, we assume that operative linkages are quite 
common. Second, we assume that individuals care only about the 
consequences of giving, and not directly about the amount given. To 

36 For example, Abel (1984) demonstrates that social security may have real effects in
a dynastic world by inducing redistributions between families. In the light of our
analysis, it is clear that, once one adopts dynastic assumptions, distributional questions
are ill posed. Chamley (1981) and Judd (1985) study the welfare effects of capital
income taxation in dynastic models. Yet the premises of these models may imply neutralization
of the very distortion they purport to study.
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establish the irrelevance of all fiscal policies and prices, we must also 
assume that actions are publicly observable. If one relaxes this last 
assumption and supposes instead that capital markets are perfect, it is 
then possible to demonstrate that all lump-sum redistributions are 
irrelevant. 37 

The last two assumptions (public observability and/or perfect capi¬ 
tal markets) are certainly objectionable on empirical grounds. How¬ 
ever, this does not fully account for our skepticism concerning the 
model’s implications. Suppose, for example, that the government 
adopted and effectively enforced (through enormous penalties) a new 
law requiring public disclosure of all private financial decisions. We 
would not expect fiscal policy and prices to become irrelevant as a 
consequence. We conclude that an important source of our disbelief 
must lie elsewhere. 

The second assumption might fail for several reasons. Generosity 
may be inherently fulfilling. Alternatively, individuals might be my¬ 
opic with respect to the actions of their heirs and simply take the size 
of transfers as a proxy for well-being. Both views are somewhat ap¬ 
pealing, but neither leads to a satisfactory theory of transfers. For 
example, it is difficult to know why an individual would care about the 
magnitude of his transfer if it truly did not affect any real outcome. In 
both cases, the specification of the transfer motive is necessarily ad 
hoc. 

Violations of the first assumption fall into two categories: either a 
large number of people fail to make positive transfers or corners 
matter despite the fact that transfers are positive. Many commen¬ 
tators have indeed claimed that corner constraints bind for most indi¬ 
viduals. Barro (1984) offers a theoretical reason for expecting this 
outcome but does not elucidate implications for policy. Our analysis 
suggests a somewhat different reason, which raises some intriguing 
possibilities. 

To illustrate, consider a world in which there are three successive
generations (t = 1, 2, 3), each consisting of N households. For purposes
of interpretation, one should think of each household as a
married couple. Each member of generation t = 1, 2 has two children,
but, of course, children are shared (a child household is formed
by the marriage of two individuals who come from two different
parent households). Suppose that the children of the ith household in
generation 1 belong to the m(i)th and f(i)th households in generation
2 (where these indices are assigned so that everyone in generation 2
has two forebears in generation 1). Further suppose that the children

37 It may also be possible to dispense with the assumption that capital markets are
perfect (see Hayashi 1985; Yotsuzuka 1986).
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of the ith household in generation 2 belong to the ith and (i + 1)st
households in generation 3 (with the exception that N’s children belong
to the Nth and first households). We select this stylized pattern of
linkages in order to guarantee that there is a chain that interconnects 
the entire population. 
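The claim that this pattern ties everyone together can be checked mechanically for generations 2 and 3. The following is a minimal sketch under stated assumptions: households are indexed from 0 rather than 1, the helper names `ring_links` and `is_connected` are hypothetical, and generation 1 is omitted because its child assignments are not pinned down in the text beyond each household having two children in generation 2.

```python
from collections import deque


def ring_links(n):
    """Parent-child links between generations 2 and 3.

    Household i of generation 2 has children in households i and
    (i + 1) % n of generation 3, so household n - 1 wraps around
    to household 0.
    """
    links = []
    for i in range(n):
        links.append((("g2", i), ("g3", i)))
        links.append((("g2", i), ("g3", (i + 1) % n)))
    return links


def is_connected(links):
    """Breadth-first search over the undirected graph of links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(adjacency)


# For contrast: if each parent household had children in only one child
# household, the population would split into n separate families.
isolated = [(("g2", i), ("g3", i)) for i in range(50)]

print(is_connected(ring_links(50)))   # ring pattern from the text
print(is_connected(isolated))         # no sharing of children
```

The shared child households are what do the work here: each generation 3 household has parents in two adjacent generation 2 households, so a chain of links runs around the whole ring.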

We will assume that all households are identical within generations. 
The utility of household i in generation t is given by 


$$
u^t_i =
\begin{cases}
u(c^1_i) + k\,[u^2_{m(i)} + u^2_{f(i)}], & t = 1, \\
u(c^2_i) + k\,(u^3_i + u^3_{i+1}), & t = 2, \\
u(c^3_i), & t = 3.
\end{cases}
$$


This individual is endowed with initial wealth, uA 

Behavior unfolds as follows. First, each member of generation 1 
chooses its own consumption and transfers to its children. Next, each 
member of generation 2 does the same. Finally, members of genera¬ 
tion 3 consume their endowments plus all transfers received. 

Suppose that along some symmetric subgame-perfect Nash equilib¬ 
rium path, all members of generation 2 make operative transfers to 
their children. 38 The analysis of Section V then establishes that the 
consumption of any household in either generation 2 or 3 depends 
only on the total resources available to all members of those genera¬ 
tions. If N is large, then the marginal propensity to consume from 
wealth for any given individual must be close to zero (this point is 
analogous to Sugden’s [1982] observation concerning the provision of 
charity). Thus each member of generation 1 knows that his gifts will 
have a negligible impact on the consumption of his descendants. In 
contrast, gifts involve a nonnegligible sacrifice of his own consump¬ 
tion. Thus, under relatively weak conditions, no member of genera¬ 
tion 1 will make an operative transfer. 
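A small arithmetic illustration may help fix the orders of magnitude. The sketch assumes, purely for illustration, that the 2N households of generations 2 and 3 share total resources equally; the paper does not impose this rule, and the function name `descendant_gain` is hypothetical. More generally, since the consumption changes of the 2N households must add up to the gift, symmetry within generations and nonnegative responses imply that no single household's response can exceed 1/N of it.

```python
def descendant_gain(gift, n_households):
    """Rise in one descendant household's consumption from a gift,
    under the illustrative assumption that generations 2 and 3
    (2 * n_households households in all) share resources equally."""
    return gift / (2 * n_households)


gift = 1.0  # one unit of the donor's own consumption given up
for n in (1, 10, 1000, 1_000_000):
    gain = descendant_gain(gift, n)
    print(f"N = {n:>9}: donor gives up {gift}, each descendant gains {gain:.7f}")
```

The donor's sacrifice stays at one unit while the benefit to any particular descendant shrinks in proportion to 1/N, which is why corners become the expected outcome for generation 1 when N is large.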

The main point raised by this discussion is that in large populations
in which preferences are dynastic and decisions are sequential, large
numbers of individuals must end up at corners. In addition, the particular
model considered here produces endogenous cycles: one generation
acts altruistically, making transfers to its children, while the next
generation, despite being identical to the first, acts selfishly (this remains
true as one adds generations). While we do not seriously propose
this particular pattern as descriptive of the real world, our analysis
does suggest that endogenous behavior may well give rise to
patterns of operative linkages that do not generate standard dynastic
results, such as Ricardian equivalence.

While there are both empirical and theoretical reasons for doubting 


38 One can derive relatively weak conditions under which this occurs.
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that most individuals make positive transfers, we are unable to fully 
attribute our disbelief to this assumption. We suspect that, the thrill of 
victory aside, most individuals would prefer winning $1,000 in a lot¬ 
tery to learning that one of their siblings has won $1,000, despite the 
expectation of future transfers from the parent. Yet dynasticism im¬ 
plies that one should be indifferent. 

We are therefore led to reexamine the other aspect of our first 
assumption: corners matter, even though they do not bind, in the 
traditional sense. As far as we know, the only fully elaborated theories 
that are compatible with this view envision transfers as a means of 
facilitating exchange within families (see, e.g., Kotlikoff and Spivak 
1981; Bernheim et al. 1985). Accordingly, we believe that subsequent 
policy analyses should consider more carefully the implications of 
nonstandard alternatives to the dynastic transfer motive. 
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Intertemporal Substitution in Consumption 
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Stanford University and National Bureau of Economic Research 


One of the important determinants of the response of saving and 
consumption to the real interest rate is the elasticity of intertemporal 
substitution. That elasticity can be measured by the response of the 
rate of change of consumption to changes in the expected real inter¬ 
est rate. A detailed study of data for the twentieth-century United 
States shows no strong evidence that the elasticity of intertemporal 
substitution is positive. Earlier findings of substantially positive elas¬ 
ticities are reversed when appropriate estimation methods are used. 


I. Introduction 

A higher expected real interest rate makes consumers defer con¬ 
sumption, everything else held constant. The magnitude of this inter¬ 
temporal substitution effect is one of the central questions of mac¬ 
roeconomics. If consumers can be induced to postpone consumption 
by modest increases in interest rates, then (1) movements of interest 
rates will make consumption decline whenever other components of 
aggregate demand rise—total output will not be much influenced by 
changes in those components; (2) the deadweight loss from the taxa¬ 
tion of interest is important; (3) the burden of the national debt or 
unfunded social security is relatively unimportant; and (4) consump¬ 
tion will move along with real interest rates over the business cycle, to 
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name four of the many issues that rest on the intertemporal substitut¬ 
ability of consumption. 

This paper estimates parameters of the representative individual’s 
utility function rather than parameters of the consumption function 
or savings function. As Lucas (1976) has pointed out, there may not 
be anything that could be properly called a consumption or savings 
function: the relation between consumption, income, and interest 
rates depends on the wider macroeconomic context and may not be 
stable over time, even though consumers are always trying to max¬ 
imize the same utility function. The techniques of this paper are more 
robust with respect to this kind of instability than standard 
econometric models of consumption and savings. 

The essential idea of the paper is that consumers plan to change 
their consumption from one year to the next by an amount that 
depends on their expectations of real interest rates. Actual move¬ 
ments of consumption differ from planned movements by a com¬ 
pletely unpredictable random variable that indexes all the informa¬ 
tion available next year that was not incorporated in the planning 
process the year before. If expectations of real interest rates shift, 
then there should be a corresponding shift in the rate of change of 
consumption. The magnitude of the response of consumption to a 
change in real interest expectations measures the intertemporal elas¬ 
ticity of substitution. All this is set up in a formal econometric model 
in which the assumptions are formalized and the estimation tech¬ 
niques rigorously justified. 

Over the postwar period, there have been downward and upward 
shifts in the expected real return from common stocks, Treasury bills, 
and savings accounts, three of the investments that govern the real 
interest rate for consumers. Over the same period, there have been 
only small shifts in the rate of growth of consumption. Consequently, 
all the estimates presented in this paper of the intertemporal elasticity 
of substitution are small. Most of them are also quite precise, support¬ 
ing the strong conclusion that the elasticity is unlikely to be much 
above 0.1, and may well be zero. 

II. Theory of the Consumer under Uncertain 
Real Interest Rates 

Finance theory has examined the role of the consumer in an economy 
with one or more securities with stochastic returns. Breeden (1977, 
1979) was the pioneer in what has become known as the consumption 
capital asset pricing model. Hansen and Singleton (1983) provide an
application of the model to macroeconomic consumption data. Mankiw,
Rotemberg, and Summers (1985) have extended the model to
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include labor supply. The basic model of the joint distribution of
consumption and the return earned by one asset that has emerged
from that research is the following: The joint distribution of the log of
consumption in period t, $c_t$, and the return earned by the asset from
period t - 1 to period t, $r_{t-1}$, is normal with a covariance matrix that is
unchanging over time. The means obey the linear relation

$$\bar{c}_t = \sigma \bar{r}_{t-1} + c_{t-1} + k. \qquad (1)$$

That is, the expected change in the log of consumption is a parameter,
$\sigma$, times the expected real return plus a constant.
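The linear relation can be illustrated with simulated data. The sketch below is not the estimation method of this paper: it assumes, for illustration only, that the expected real return is directly observed, whereas in practice it must be inferred. All numbers and the names `simulate` and `ols_slope_intercept` are hypothetical. It generates consumption growth as the parameter times the expected return plus a constant and an unpredictable surprise, and then recovers the parameter by ordinary least squares.

```python
import random


def simulate(sigma, k, periods, seed=0):
    """Generate expected real returns and log consumption growth that
    satisfy growth = sigma * expected_return + k + surprise."""
    rng = random.Random(seed)
    expected_returns, growth = [], []
    for _ in range(periods):
        r_bar = rng.gauss(0.02, 0.03)     # expected real return (assumed observed)
        surprise = rng.gauss(0.0, 0.01)   # unpredictable part of consumption growth
        expected_returns.append(r_bar)
        growth.append(sigma * r_bar + k + surprise)
    return expected_returns, growth


def ols_slope_intercept(x, y):
    """Ordinary least squares of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


true_sigma, true_k = 0.1, 0.015
r_bar, dc = simulate(true_sigma, true_k, periods=5000)
sigma_hat, k_hat = ols_slope_intercept(r_bar, dc)
print(f"estimated sigma = {sigma_hat:.3f}, estimated k = {k_hat:.4f}")
```

With a small true elasticity, large swings in expected returns translate into only small movements in consumption growth, which is the pattern the paper reports for the postwar data.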

The simplest rationalization for this model of the joint distribution 
of the two variables is based on the hypothesis that the consumer 
maximizes the expected value of an intertemporally separable utility 
function 

-<l/cr>]£, (2) 

For the purposes of deriving the joint distribution of consumption 
growth and the real return, it is not necessary to make specific as¬ 
sumptions about the market setting of the maximization. At one ex¬ 
treme, the consumer could face a full set of markets in contingent 
commodities, and then the budget constraint would say that the sum 
of all the consumer’s demands for the contingent claims valued at 
market prices would equal his endowment. At the other extreme, the 
consumer could be Robinson Crusoe, with a single risky investment in 
a real asset. Then the budget constraint would say that his holdings of 
the real asset could never be negative. For a further discussion of this 
point, see Grossman and Shiller (1982). 

In any case, one of the many choices facing the consumer is to
spend a little less in year t - 1, invest the savings in one asset, and
spend the stochastic proceeds in year t. Suppose that a unit investment
in year t - 1 has the stochastic return $e^{r_{t-1}}$ in year t. The
first-order condition for the deferral of a small amount of consumption
from period t - 1 to period t, considered from the point of view of a
consumption decision made in t - 1, is

$$E_{t-1}\left[e^{\,r_{t-1} - \delta - (1/\sigma)c_t} - e^{-(1/\sigma)c_{t-1}}\right] = 0. \qquad (3)$$

Equation (3) is the precise mathematical formulation of the principle 
that the marginal rate of substitution should equal the ratio of the 
prices of present and future consumption. Under uncertainty, it is 
not true that the expected marginal rate of substitution should equal 
the expected price ratio (the discount function). Rather, the appropriate
discount rate is the risk-adjusted one described by the first factor
in equation (3); it is the expectation of the product of the discount
function and the marginal stochastic utility next period. This expression
is related to the “consumption beta” of modern finance theory.

The reallocation condition of equation (3) is the generalization of
the proposition investigated in my earlier paper (Hall 1978) that marginal
utility should be a trended random walk when real interest rates
are constant over time. Further progress in translating the reallocation
condition into consequences for observed variables requires assumptions
about the distributions of the random influences. A set of
assumptions related to those introduced by Breeden (1977) seems a
natural approach. First, assume that the real interest rate, $r_{t-1}$,
conditional on information available in year t - 1, obeys the normal
distribution with mean $\bar{r}_{t-1}$. Because interest rates as they are defined in
this paper can be indefinitely negative, the normal distribution is a
natural assumption. Second, assume that the consumer’s rule for processing
new information about income and interest rates makes the
distribution of consumption lognormal, conditional on information
available last year; that is, the log of consumption, $c_t$, is normal with
mean $\bar{c}_t$. Because the
new information arriving in year t has a bearing both on the actual
return to investments maturing in t and on the consumer’s long-term
well-being estimated in that year, the two random variables $r_{t-1}$ and $c_t$
will be correlated.

Applying the intertemporal allocation condition under the assumptions
of lognormality gives the relation between the expected value of
the log of consumption in period t given consumption in period t - 1
and the mean of the distribution of the real interest rate:

$$\bar{c}_t = \sigma\,\bar{r}_{t-1} + c_{t-1} + k. \qquad (1)$$

Here k is a constant that depends on the variances and covariance of
$r$ and $c$. I will assume that it does not change significantly over time.

This condition says that the mean level of consumption in period t
generated by the consumer’s choice as of period t - 1 is the level of
consumption chosen for period t - 1 plus a constant plus an adjustment
positively related to the mean of the real interest rate. A high
value of $\sigma$ means that, when the real interest rate is expected to be
high, the consumer will actively defer consumption to the later period.

The condition is a constraint on the consumption rule. It says that 
an optimal rule will wind up choosing a level of consumption in period
t, after the new information becomes available, whose mean
obeys this restriction. The condition is not a complete description of 
consumption behavior under uncertainty. It does not describe the 
actual amount by which consumption changes when new information 
about income or asset returns becomes available. 

The actual log of consumption in period t, $c_t$, differs from the
mean, $\bar{c}_t$, by a completely unpredictable surprise, which I will call $\epsilon_t$.
By the hypotheses already stated, $\epsilon_t$ is a normal random variable. The
two equations of interest can be put in the form of a bivariate regression:

$$c_t = \sigma\,\bar{r}_{t-1} + c_{t-1} + k + \epsilon_t, \qquad (4)$$

$$r_{t-1} = \bar{r}_{t-1} + \nu_t. \qquad (5)$$

The random variable $\nu_t$ also has the normal distribution.

If the expected real interest rate $\bar{r}_{t-1}$ is observed directly, then the
key parameter $\sigma$ can be estimated simply by regressing the change in
the log of consumption on the expected real rate. That regression also
has the property that no other variable known in period t - 1 belongs
in the regression. The strong testable implication of the theory is that
the mean of the rate of growth of consumption is shifted only by the
mean of the real interest rate. Information available in year t - 1 is
helpful in predicting the rate of growth of consumption only to the
extent that it predicts the real interest rate. This testable implication is
the logical extension of the one derived in my earlier paper (Hall
1978) under constancy of real interest rates. In that case, no variable
known in year t - 1 should help predict the rate of growth of consumption.
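
As a rough illustration of the regression just described, the following sketch simulates equation (4) with a directly observed expected real rate and recovers $\sigma$ by ordinary least squares. The parameter values, the normal draws, and the function names are illustrative assumptions and are not taken from the data or procedures of this paper.

```python
import random


def simulate_consumption_growth(sigma, k, n, seed=0):
    """Draw expected real rates and consumption growth obeying equation (4).

    All numerical settings (mean and spread of the expected rate, size of
    the surprise) are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    rbar = [rng.gauss(0.02, 0.03) for _ in range(n)]        # expected real rate
    dc = [sigma * r + k + rng.gauss(0.0, 0.02) for r in rbar]  # change in log consumption
    return rbar, dc


def ols_slope_intercept(x, y):
    """Ordinary least squares of y on x with a constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rbar, dc = simulate_consumption_growth(sigma=0.1, k=0.01, n=5000)
sigma_hat, k_hat = ols_slope_intercept(rbar, dc)
print(round(sigma_hat, 3), round(k_hat, 4))
```

Because the regressor here is the expected rate itself, the surprise $\epsilon_t$ is uncorrelated with it by construction, which is why simple least squares is adequate in this case but not when only realized returns are available.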


A. Interpretation of the Parameter $\sigma$

In the model of finance theory, based on the maximization of the
expected value of the intertemporally additive utility function of
equation (2), the parameter $\sigma$ is interpreted as the reciprocal of the
coefficient of relative risk aversion. However, $\sigma$ is the intertemporal
elasticity of substitution as well. The two parameters are assumed to
be reciprocals of one another. If consumers are highly risk averse,
they must have low intertemporal substitution as well. For the purposes
of this research, it would be desirable to eliminate this automatic
connection between intertemporal substitution and risk aversion. The
empirical finding that intertemporal substitution is weak or absent
does not contradict any widely held beliefs about consumer behavior.
But the corresponding conclusion that the coefficient of relative risk
aversion is close to infinity is incompatible with the observed willingness
of consumers to take on risk. Hence, it is desirable to show that
the finding of this paper has no automatic bearing on risk aversion.

It is a topic of active research in the theory of the consumer as to 
how to characterize preferences about uncertainty and intertemporal 
choice. It is apparent that the relationship between the rate of growth 
of consumption and the expected real interest rate is governed by the 
intertemporal substitution aspect of preferences. That is, the parameter
$\sigma$ in equation (1) is on its face the intertemporal elasticity; it is
literally the elasticity of the consumption ratio to the corresponding
relative price. Under certainty, where only intertemporal substitution
has a role, equation (1) would hold for an intertemporally separable
utility function with a constant elasticity of substitution, $\sigma$.

With the maximization of the expected value of the utility function
(2), it is unarguably the case that $1/\sigma$ is the coefficient of relative risk
aversion as well. However, recent work has shown that equation (1)
does not reveal the coefficient of relative risk aversion under less
restrictive assumptions about preferences. The earliest research on
finding a clean separation between intertemporal substitution and
risk aversion appears in a characterization of preference orderings
with only two time periods by Selden (1978) in what he calls the
ordinal certainty equivalence (OCE) framework. The OCE setup departs
from expected utility but retains additive separability. In the
OCE framework, the relation between consumption growth and expected
real interest reveals just the intertemporal elasticity of substitution
and says nothing about the coefficient of relative risk aversion.
Selden lets one concave function describe risk aversion in the second
period. With it, risk aversion has the effect of reducing uncertain
future consumption to its certainty equivalent. Then a second utility
function describes intertemporal substitution between current consumption
and the certainty equivalent of future consumption. Under
the assumption that both utility functions have the constant-elastic
form, and the same lognormal assumptions about the interest rate
and future consumption, the OCE framework gives rise to exactly
equation (1) (for details, see Hall [1985]). It is an unambiguous conclusion
that the intertemporal elasticity of substitution alone controls
the relation between consumption growth and the expected real interest
rate. Unfortunately, the OCE framework does not generalize in
any known way to more than two periods (see Johnsen and Donaldson
1985).

More recently, partly in response to remarks in earlier versions of
this paper, a number of authors have developed representations of
intertemporal preferences under uncertainty that depart from expected
utility in a way originally proposed by Kreps and Porteus
(1978). Attanasio (1987), Epstein and Zin (1987a, 1987b), Weil (1987),
and Zin (1987a, 1987b) have all shown that, under suitable assumptions,
the Kreps-Porteus setup implies that the coefficient $\sigma$ is the
intertemporal elasticity of substitution and not the reciprocal of the
coefficient of relative risk aversion.

In the framework developed here, the bivariate relation between 
consumption and real interest rates does not necessarily reveal anything
about risk aversion. Estimation of the risk aversion parameter
would be possible in a multivariate system that considered the real
returns to two or more assets. Then the magnitudes of the risk premiums
together with the correlations of the returns with consumption
would provide estimates of the coefficient of relative risk aversion.

In this paper, I will refer to the parameter $\sigma$ as the intertemporal
elasticity of substitution; I do not think this interpretation is at all
controversial. Readers who have a strong prior belief that the utility
function is additively separable and that consumers follow the principle
of maximizing expected utility will also interpret $\sigma$ as the reciprocal
of the coefficient of relative risk aversion. Others, such as myself,
will avoid drawing any conclusions about risk aversion from the
results presented here.

B. Relation to Hansen and Singleton 

Hansen and Singleton (1983) studied the joint distribution of the rate
of growth of consumption and asset returns in the conventional expected
intertemporal utility framework. They do not mention intertemporal
substitution in their discussion at all. They identify the single
critical parameter they estimate as the coefficient of relative risk
aversion. Their statistical model is the same as the one derived here.
Their estimation technique, based on maximum likelihood in a bivariate
system, examines the relation between the rate of change of consumption
and expected real asset returns and interprets the coefficient
as the reciprocal of the coefficient of relative risk aversion. In
their framework, as I mentioned above, the coefficient is also the
intertemporal elasticity of substitution. The argument offered here
suggests that Hansen and Singleton’s estimated coefficient may not be
informative about risk aversion. However, I do not offer any evidence
on this question one way or the other.

Hansen and Singleton (1983), and Grossman and Shiller (1982) 
before them, are on firm ground in treating the differences in returns 
among assets as revealing something about risk aversion. Indeed, 
Hansen and Singleton’s rejection of the cross-equation restrictions in 
a model combining consumption growth with returns on multiple 
assets may occur because the intertemporal elasticity of substitution is 
different from the reciprocal of the coefficient of relative risk aversion.

III. Expectations of the Real Interest Rate 

I take two approaches to the measurement of the expected real interest
rate. First, I study changes in consumption over a period for which
survey data on expected price changes are available. The expected
real interest rate is the market nominal rate for an instrument of 
suitable term, adjusted for taxes, less the expected rate of change of 
the price level. Real returns from the stock market can also be used in 
this framework because survey data on expected nominal stock prices 
are available. 

The second approach relates the conditional mean of the real interest
rate, $\bar{r}_{t-1}$, to observed variables known to consumers at the time
that they choose $c_{t-1}$. Recall that $\bar{r}_{t-1}$ is the mean of the subjective
distribution for the real interest rate held by the typical consumer at
the time consumption decisions are made for year t - 1. A specification
for expectations that has been employed frequently in macroeconomic
models derived from rational expectations and, in particular,
underlies the recent work of Hansen and Singleton is derived as
follows. Let the mean of the subjective distribution be a linear combination
of observed variables,

$$\bar{r}_{t-1} = x_{t-1}\beta, \qquad (6)$$

and suppose that the coefficients, $\beta$, are known in advance. Under
this specification, the complete model of expectations and consumption
becomes a simple application of bivariate regression with parameter
constraints across the equations. Alternatively, the same estimation
technique can be thought of as instrumental variables applied to
the consumption equation, with the determinants of the expected real
rate as the instruments. The second interpretation is the one adopted
in this paper, in which all estimates are obtained by instrumental
variables except when expectations are observed directly.


IV. Time Aggregation

The basic equation for the rate of change of consumption,

$$\Delta c_t = \sigma\,\bar{r}_{t-1} + k + \epsilon_t, \qquad (7)$$

refers to consumption in discrete time. From the derivation in Section
II, it is also apparent that it applies to observations on the instantaneous
flow of consumption measured at two points of time in a setup
in which time is measured continuously. However, it does not correctly
characterize the behavior of time averages of consumption. If $c_t$
is the average flow of consumption over an interval of continuous
time, then the relation of its rate of change to the real interest rate is
more complex. My discussion will note the difference between time
aggregation of the level of consumption and aggregation of its
logarithm. The difference is trivial for consumption because it
changes so little during any given year.

As with other aggregation problems in econometrics, time aggregation
for the left-hand variable causes only mild problems. If the right-hand
variable is observed continuously, or at least quite frequently,
then the aggregation of the left-hand variable in effect defines an
appropriate way to aggregate the right-hand variable. The problem
of time aggregation becomes much more difficult if only a time average
of the right-hand variable is available (see Grossman, Melino, and
Shiller 1985). However, in the present case, interest rates and rates of
inflation are measured monthly or more frequently over the whole
time span for which any data at all are available for consumption, so
the time aggregation problem is readily soluble.

Suppose that only a time average of consumption is observed, say
once a year. Each month, the expected real interest rate is known; call
it $\bar{r}_{t,m}$, with t the year and m the month. There is an unobserved $c_{t,m}$ each
month, and it evolves as

$$\Delta c_{t,m} = \sigma\,\bar{r}_{t,m-1} + \epsilon_{t,m}. \qquad (8)$$

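Now write out $c_{t-1,m}$ and $c_{t,m}$ as increments over a common initial value.
Note that, to a close approximation,

$$c_t \approx \frac{1}{12}\sum_{m=1}^{12} c_{t,m} + \log 12. \qquad (9)$$

Then a little manipulation shows that the change in aggregate consumption
is

$$\Delta c_t = \frac{1}{12}\sum_{m=1}^{12}(m-1)\left(\sigma\,\bar{r}_{t-1,m-1} + \epsilon_{t-1,m}\right)
+ \frac{1}{12}\sum_{m=1}^{12}(12-m+1)\left(\sigma\,\bar{r}_{t,m-1} + \epsilon_{t,m}\right). \qquad (10)$$

Define the time aggregates of the expected real interest rate and the
random element as

$$\bar{r}_{t-1} = \frac{1}{12}\left[\sum_{m=1}^{12}(m-1)\,\bar{r}_{t-1,m-1} + \sum_{m=1}^{12}(12-m+1)\,\bar{r}_{t,m-1}\right], \qquad (11)$$

$$\epsilon_t = \frac{1}{12}\left[\sum_{m=1}^{12}(m-1)\,\epsilon_{t-1,m} + \sum_{m=1}^{12}(12-m+1)\,\epsilon_{t,m}\right]. \qquad (12)$$

Then the relation among the time aggregates is

$$\Delta c_t = \sigma\,\bar{r}_{t-1} + \epsilon_t. \qquad (13)$$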

Two properties of the aggregate random element $\epsilon_t$ call for note.
First, as Working derived in a famous paper (1960), $\epsilon_t$ is not white
noise; rather, it obeys a first-order moving average process with serial
correlation .25. Second, $\epsilon_t$ is likely to be correlated with $\bar{r}_{t-1}$ or with its
determinants or instruments, even if these variables are uncorrelated
at the monthly level.
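
The serial correlation cited from Working (1960) can be checked directly from the aggregation weights in equations (10) and (12). The sketch below assumes that the monthly disturbances are white noise with a common variance; the function names are hypothetical.

```python
def aggregate_weights(months=12):
    """Weights that the annual disturbance places on monthly disturbances.

    The first list holds the weights (m - 1) on the previous year's months,
    the second the weights (months - m + 1) on the current year's months.
    The common factor 1/months is omitted because it cancels in a correlation.
    """
    prev_year = [m - 1 for m in range(1, months + 1)]
    this_year = [months - m + 1 for m in range(1, months + 1)]
    return prev_year, this_year


def first_order_autocorrelation(months=12):
    """Correlation between successive annual disturbances.

    Successive annual disturbances share only the monthly shocks of the
    overlapping year: one weights them by (months - m + 1), the next by (m - 1).
    """
    prev_year, this_year = aggregate_weights(months)
    variance = sum(w * w for w in prev_year) + sum(w * w for w in this_year)
    covariance = sum(a * b for a, b in zip(this_year, prev_year))
    return covariance / variance


print(first_order_autocorrelation(12))  # about 0.247
```

With 12 months per year the ratio is 286/1156, a little under the limiting value of .25 that applies as the number of subperiods grows.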

The combination of serial correlation in the residuals and endogenous
instrumental variables calls for an estimator designed to deal
with these circumstances. Hayashi and Sims (1983) have provided
what seems to be the most suitable estimator for this problem. They
propose that the data on the left- and right-hand variables undergo a
preliminary transformation that yields a scalar covariance matrix for
the disturbances. However, the transformation must also preserve the
timing conditions that make the instruments and the transformed
disturbances orthogonal. The standard autoregressive transformation
destroys the timing conditions. On the other hand, an autoregressive
transformation that subtracts future rather than past values
will preserve the timing conditions and accomplish the necessary
transformation. Application of the Hayashi-Sims estimator in the
present case is particularly easy because the time-series process for the
disturbances is prescribed by theory and does not need to be estimated
in a preliminary stage. The process is first-order moving average
with a parameter of 0.27. The corresponding autoregressive
transformation can be closely approximated as

$$\Delta\tilde{c}_t = \Delta c_t - .27\,\Delta c_{t+1} + .07\,\Delta c_{t+2}. \qquad (14)$$

These are the first two terms of the infinite autoregressive representation
of the first-order moving average process. The same transformation
is applied to the real interest rate variable. Then the instrumental
variables estimator is applied to the transformed variables, using
untransformed lagged variables as instruments. Hayashi and Sims show
that the resulting estimates are consistent and that the standard estimate
of the covariance matrix of the estimates is also consistent.
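
The link between the serial correlation of .25, the moving average parameter of 0.27, and the coefficients in equation (14) can be sketched as follows. The code assumes an invertible first-order moving average and truncates the forward filter after two leads, as in the text; the function names are hypothetical and this is not the Hayashi-Sims estimator itself, only its preliminary transformation.

```python
import math


def ma1_parameter(rho):
    """Invertible MA(1) parameter theta solving rho = theta / (1 + theta**2).

    Requires 0 < abs(rho) <= 0.5.
    """
    return (1.0 - math.sqrt(1.0 - 4.0 * rho * rho)) / (2.0 * rho)


def forward_filter(series, theta, terms=2):
    """Apply x_t - theta*x_{t+1} + theta**2*x_{t+2} - ... using future values.

    The last `terms` observations are dropped because their leads are missing.
    """
    coeffs = [(-theta) ** j for j in range(terms + 1)]
    return [
        sum(c * series[t + j] for j, c in enumerate(coeffs))
        for t in range(len(series) - terms)
    ]


theta = ma1_parameter(0.25)
print(round(theta, 2), round(theta ** 2, 2))  # 0.27 0.07
```

Because the filter uses only current and future values of the variable, instruments dated sufficiently far in the past remain orthogonal to the transformed disturbance.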

In estimating the time-aggregated Euler equation, the timing of the
instruments turns out to be critical. If the data measured the instantaneous
flow of consumption at two isolated points, any variable known
at the time that $c_{t-1}$ was chosen would be eligible as an instrument.
However, when $c_{t-1}$ is an annual average, it is apparent that any
variable measured during calendar year t - 1 can be correlated with
the disturbance $\epsilon_t$. For annual data, the most recent permissible
instrument is one measured in December of year t - 2. Annual aggregates
for year t - 2 and earlier are usable, but not those for year t - 1.
The most recent variables eligible as instruments are the change in
annual log consumption in year t - 2, the level of the average real
return over year t - 2, and the nominal return in December of year
t - 2.

V. Data 

Following are brief definitions of the data series used in this study:
$c_t$ = log of real consumption of nondurables (not including services)
in year, quarter, or month t, from the U.S. National Income and
Product Accounts; the data are available monthly from 1959, quarterly
from 1947, and annually from 1919 (for derivation before 1929,
see Hall [1986]); $r_t$ = realized real return after taxes on an investment
in the Standard and Poor’s (S&P) 500 stock portfolio, liquidated at a
later date corresponding to the consumption variable, or realized real
return after taxes from a savings account earning the regulated passbook
interest rate, or realized real return after taxes from holding a
sequence of four 90-day Treasury bills over the year; $h_t$ = log of the
S&P 500 index of share prices, deflated; $d_t$ = dividend yield of the
S&P 500; $z_t$ = nominal yield of Treasury bills, discount basis; $q_t$ =
nominal passbook interest rate in the third quarter; $p_t$ = log of the
implicit deflator for consumption of nondurables (used as a deflator
for all deflated variables).

After-tax magnitudes were calculated using the effective marginal 
rate under the federal personal income tax from Barro and Sahasakul 
(1983). The full nominal amount of dividends and interest was assumed
to be taxed at this effective marginal rate. Capital gains and
losses were assumed to be untaxed on the grounds that the combination
of low statutory rates, taxation only at realization, and forgiveness
of accrued gains at death makes the effective rate close to zero.
All data for the study are available from the author on an IBM diskette.


VI. Summary of Results 

Following is a brief summary of the various attempts I have made to
estimate the intertemporal elasticity of substitution by estimating the 
relationship of the rate of change of consumption to expected real 
interest rates. 

The first set of results uses inflation and stock price expectations 
recorded in the Livingston survey (see Sec. VII). In this work, the 
expected real return is measured directly and the elasticity of substitution
estimated by simple regression. For real returns in the stock
market, the results are informative: the elasticity of substitution is
close to zero and the estimate has a small standard error. For savings
accounts and Treasury bills, the estimates are almost useless because
of large standard errors. In these cases, the lack of variation in the 
expected real return makes it difficult to estimate the elasticity. 

A second set of results uses annual changes in consumption starting 
in 1924. The real return on Treasury bills is aggregated from 
monthly data as suggested in Section IV. Because this technique uses 
a longer span of data and uses all the data for each year, the standard 
error of the estimate of the intertemporal elasticity is much smaller. 
The point estimate of the elasticity is negative. All positive values lie 
outside the 95 percent confidence interval. 

A third set of results reconciles the findings of this paper—that the 
intertemporal elasticity is around zero—with Hansen and Singleton’s 
finding of large positive elasticities. The difference comes from their 
choice of a time period for estimation and from their use of instruments
that are correlated with the innovation in the real return.

A fourth set of results examines Summers’s (1982) findings of intertemporal
elasticities of around one, using quarterly postwar data.
Again, use of an appropriate estimator reverses his conclusion. 

My overall conclusion from all four sets of results is that the evidence
points in the direction of a low value for the intertemporal
elasticity. The value may even be zero and is probably not above .2. 

Before plunging into formal econometric results, I think it is useful 
to indicate why the data point toward the answer that pervades the 
results of this paper, namely, that the intertemporal elasticity of substitution
is small. Some simple facts about the data are apparent just
by taking averages over 5-year intervals. The averaging removes most 
of the random expectation errors but turns out to leave a good deal of 
variation in the real interest rate. Figure 1 shows the real after-tax 
return on Treasury bills and the rate of change of consumption for 
intervals from 1921 through 1940 and 1946-83 (the last interval is 
only 3 years long). 

Except for three of the observations, the rate of change of consumption
is close to its average value of a little below 3 percent per
year. When consumption was near average, however, the real interest
rate varied from -5 percent to +5 percent. The only observation
combining a high real interest rate and rapid consumption growth
was for 1921-25, in the upper right-hand corner. The other observation
with high consumption growth was for 1936-40, when the real
interest rate was almost exactly zero. The period 1931-35 had a high
real interest rate and slightly negative consumption change. As a
general matter, figure 1 makes a fairly strong case that periods of
high real interest rates have not typically been periods of high consumption
growth. Rather, consumption growth has generally stuck
fairly close to its average value no matter what has happened to real
interest rates.

Fig. 1. Five-year averages of the real return on Treasury bills (horizontal axis) and
the rate of change of consumption (vertical axis), 1921-40 and 1946-83.

VII. Results Based on the Livingston Survey 

Each November, Joseph Livingston asks a panel of economists to 
predict the values of a long list of economic variables for the following 
June. Among the variables are the consumer price index and the S&P 
400 stock price index. From these, it is possible to construct three 
measures of expected real returns that are relevant for consumers. 

Treasury bills. The starting point is the market value of a bill
maturing in June as reported in November. All elements of the expected
real rate are known except for the marginal tax rate, which is
highly predictable. I computed the expected real return as

Here z is the nominal return measured in discount form at an annual
rate (as a decimal), m is the marginal tax rate, $p_N$ is the known price
level in November, and $p_J$ is the expected price level in June.

Savings accounts. Nominal bank rates, q, are not entirely known in
advance but are highly predictable. I compute the expected real after-tax
return as


log^ 7 ' 12 * - m[ 1 - ,< 7 ' 12 *]} (16) 

Stocks. I treat the dividend yield, d, as known and use the survey
data for the expected share price. The expected real after-tax return 
is 


S[tt (1 (17) 

Here $h_N$ is the known stock price index in November and $h_J$ is the
expected index for the following June.

The dependent variable, the change in the log of consumption per
capita, is constructed to match the Livingston data as closely as possible.
Monthly data on consumption in November and June are divided
by monthly estimates of the U.S. population.

The results from regressing the log change in consumption on 
these three measures of the expected real return are the following 
(standard errors are in parentheses): 


Security            Estimate of $\sigma$    Standard Error    Durbin-Watson

Treasury bills      .346 (.337)             .0169             2.13
Savings accounts    .271 (.330)             .0170             2.17
Stocks              .066 (.050)             .0166             2.36

The results for Treasury bills and savings accounts are hardly conclusive.
The variation in the expected real returns over the 24 7-month
periods in the data is inadequate to provide any useful information
about the elasticity of substitution, $\sigma$. But for the stock market, the
results are conclusive. The estimate of $\sigma$ is close to zero and the
standard error is small as well. The confidence interval for $\sigma$ excludes
all values that correspond to strong intertemporal substitution.

VIII. Results from Annual Data with Consistent Time Aggregation

Annual averages of consumption are available starting just after
World War I. Monthly data on the realized return on Treasury bills
can be calculated for the same period. Aggregation of the real return
data to annual rates, as described in Section IV, makes it possible to
estimate the intertemporal elasticity of substitution from a much
longer historical record with much more variance in expected real
returns. The estimate of the intertemporal elasticity of substitution
obtained by applying the Hayashi-Sims estimator with the annual
change in the log of consumption per capita as the dependent variable
and realized real returns as the independent variable, with the change
in consumption 2 years earlier, the realized real return 2 years earlier,
and the nominal bill rate in December 2 years ago as instruments, for
the years 1924-40 and 1950-83, is

estimate of $\sigma$: -.40 (.20); D-W: 2.09; SE: 2.5 percent.

The finding of a negative value of the intertemporal elasticity of
substitution was not sensitive to the choice of instruments as long as
endogenous variables from year t - 1 were excluded. Separate estimates
for the pre- and postwar periods showed that the estimate was
somewhat negative in the earlier period and positive for the later
period. However, the pooled estimate clearly rejects all positive values
of $\sigma$. It simply cannot be said that the relation between the real return
and the rate of change of consumption supports strong intertemporal
substitution. Of course, the finding of a negative estimate of $\sigma$ cannot
be taken literally since it implies nonconcave utility. Rather, the conclusion
I draw is that the case for a significantly positive value of $\sigma$
cannot be made in this framework.
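
The statement that the pooled estimate rejects all positive values can be checked with simple arithmetic. The sketch below assumes a two-sided 95 percent interval based on the normal critical value 1.96, which is not stated in the text; the function name is hypothetical.

```python
def confidence_interval(estimate, std_error, critical_value=1.96):
    """Symmetric interval: estimate plus or minus critical_value * std_error."""
    half_width = critical_value * std_error
    return estimate - half_width, estimate + half_width


# Annual Treasury bill estimate reported above: -.40 with standard error .20.
lower, upper = confidence_interval(-0.40, 0.20)
print(round(lower, 3), round(upper, 3))  # -0.792 -0.008

# Livingston stock market estimate from Section VII: .066 with standard error .050.
stock_lower, stock_upper = confidence_interval(0.066, 0.050)
print(round(stock_lower, 3), round(stock_upper, 3))  # -0.032 0.164
```

Under this assumption the upper end of the annual interval falls just below zero, while the stock market interval is narrow and excludes elasticities much above .16.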


IX. Results Based on Recent Monthly Data 

Hansen and Singleton (1983) obtained results with monthly data that
can be interpreted as evidence of large values for $\sigma$. Although I have
not attempted to reproduce their results exactly, simple instrumental
estimates do give high estimates of $\sigma$, especially over the particular
time period they studied. For example, for data from October 1959
through December 1978, with the real rate lagged 1-6 months and
the rate of change of consumption lagged 1, 2, and 3 months as
instruments, I obtain

estimate of $\sigma$: .98 (.33); D-W: 2.59; SE: .81 percent.

The real return is computed from monthly averages of daily data.
Following Hansen and Singleton, I have not made adjustments in the
real return to take account of time aggregation nor have I used the
Hayashi-Sims estimator. Incorporating data through December 1983,
I get a somewhat lower value with the same procedure:

estimate of $\sigma$: .48 (.22); D-W: 2.64; SE: .79 percent.

Hansen and Singleton’s use of the immediately lagged change in log
consumption as an instrument does not give rise to a consistent estimate
of $\sigma$ when the dependent variable is the change in a time aggregate.
Section IV showed that last year’s change in consumption depends
on some of the same random disturbances as this year’s change.
The most recent change in consumption admissible as an instrument
is the one lagged 2 years.

Because data on the price level are compiled no more frequently 
than monthly, it is not possible to apply the full apparatus developed 
earlier in this paper for monthly data. However, it is possible to come 
close. A good approximation to the correct time aggregate of the real 
return on Treasury bills can be computed, with respect to the change 
in consumption between last month and this month, by using the 
simple average of the Treasury bill yields in each of the months, 
adjusted for taxes, less a moving average of price changes. The mov¬ 
ing average gives a weight of. 125 to next month’s price change, .75 to 
this month’s, and .125 to last month’s. These weights can be derived 
by combining a simple interpolation formula with the aggregation 
process derived in Section IV. Then the Hayashi-Sims estimator can 
be applied. Because the computed real return uses an additional fu¬ 
ture month’s price data, the earliest value that can be used as an 
instrument is from 3 months ago. However, the observed nominal 
yields on Treasury bills 2 and 3 months ago can be used as instruments.
Making all these changes, including dropping the immediately lagged
change in consumption from the list of instruments, further reduces
the estimate of $\sigma$:

estimate of $\sigma$: -.03 (.38); D-W: 3.00; SE: .91 percent.
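The approximate time aggregate of the real return described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the estimates: the function name `approx_real_return`, the list layout, and the sample numbers are hypothetical, the yields are taken to be already adjusted for taxes, and "each of the months" is read as last month and this month.

```python
def approx_real_return(yields, price_changes, t):
    """Approximate time-aggregated real return for month t.

    yields[k]        : after-tax Treasury bill yield in month k (hypothetical input)
    price_changes[k] : rate of price change in month k, in the same units
    Requires 1 <= t <= len(price_changes) - 2, because next month's
    price change enters the moving average.
    """
    # Simple average of the yields in last month and this month.
    nominal = 0.5 * (yields[t - 1] + yields[t])
    # Moving average of price changes with weights .125 (next month),
    # .75 (this month), and .125 (last month), as stated in the text.
    inflation = (0.125 * price_changes[t + 1]
                 + 0.75 * price_changes[t]
                 + 0.125 * price_changes[t - 1])
    return nominal - inflation


# Hypothetical monthly series, in percent.
example_yields = [0.5, 0.6, 0.7, 0.8]
example_price_changes = [0.2, 0.4, 0.4, 0.8]
example_return = approx_real_return(example_yields, example_price_changes, 2)
```

Because the value for month t uses the price change of month t + 1, a lagged real return of this kind overlaps the current period unless it is lagged far enough, which is the reason given in the text for restricting the admissible instruments.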

For the stock market, use of recent monthly data does not change the 
conclusion reached by Hansen and Singleton and the earlier results in 
this paper that the estimate of the elasticity is reliably low: 

estimate of $\sigma$: .03 (.10); D-W: 3.02; SE: .90 percent.

Large fluctuations occurred over the period from 1959 through 1983 
in the expected real return from the stock market, not matched by 
corresponding changes in the rate of change of consumption. The
monthly results for the stock market strongly confirm the results from 
7-month changes in the earlier study with the Livingston data. 

All the results for monthly data show negative serial correlation of 
the first difference of consumption after the small adjustments for the 
effects of changes in the expected real interest rate and after the 
orthogonalizing transformation. This negative serial correlation sug¬ 
gests that there is a transitory element in monthly consumption that is 
not accounted for by the model. Similar but weaker evidence is found 
for quarterly data but not for annual data. 


X. Results Based on Postwar Quarterly Data 

Summers (1982) presents results to support the view that the inter¬ 
temporal elasticity of consumption is substantial. In a subsequent pa¬ 
per (Summers 1984), he has cited his findings in making a case for the 
interest elasticity of saving: “available evidence tends to suggest that 
savings are likely to be interest elastic. I find in the more reliable 
estimates in my working paper [Summers 1982] values of the inter¬ 
temporal elasticity of substitution which cluster at the high end of the 
range Evans and I considered [above one]. Similar estimates are 
found ... by Hansen-Singleton. Where investigators find low esti¬ 
mates of intertemporal elasticity of substitution, it is usually because 
of the difficulty in modelling ex ante rates of return on corporate 
stock” (p. 252). 

I have not tried to duplicate Summers’s findings exactly. With postwar
quarterly data on consumption and time averages of the real
after-tax yield on Treasury bills computed as described in Section IV,
I have obtained the following estimate of $\sigma$ using the same inappropriate
instruments as Summers, namely the real yield, the inflation
rate, and the rate of change of consumption dated t - 1 and t - 2,
and without transforming for the serial correlation induced by time
aggregation:

estimate of $\sigma$: .34 (.13); D-W: 1.95; SE: .87 percent.

However, use of the Hayashi-Sims estimator and deletion of the
instruments known to be correlated with the disturbance reverses the
finding of an unambiguously positive $\sigma$:

estimate of $\sigma$: .10 (.23); D-W: 2.49; SE: .88 percent.
My investigation has shown little basis for a conclusion that the behavior
of aggregate consumption in the United States in the twentieth
century reveals an important positive value of the intertemporal elas¬ 
ticity of substitution. All investigators have agreed that the covariation 
of stock market returns and consumption did not suggest that con¬ 
sumption rises more rapidly in times of high expected real returns in 
the stock market. Earlier evidence based on interest-bearing securities 
such as Treasury bills had suggested values of $\sigma$ as high as one. However,
use of appropriate estimation techniques taking account of time
aggregation reverses this finding. Moreover, extension of the investi¬ 
gation to prewar years and to data from the past few years 
strengthens the evidence that periods of high expected real interest 
rates have not been periods of rapid growth of consumption. 
Transactions Costs and Covered Interest
Arbitrage: Theory and Evidence 


Kevin Clinton 

Bank of Canada 


The extent to which deviations from covered interest parity can be 
attributed to transactions costs has been exaggerated in the eco¬ 
nomic literature because the swap market in foreign exchange has 
been ignored. It is shown that such deviations should be no greater 
than the lowest of the transactions costs in one of three markets: the 
swap market or either of the two relevant securities markets. This 
reconciles the theory with the data, which show spreads of no more 
than a few basis points. However, the empirical results have no direct 
bearing on the conventional market efficiency hypothesis. 


I. Introduction 

This paper clarifies the theory of transactions costs in covered interest 
arbitrage and corrects a crucial omission in previous descriptions of 
the market arrangements, 1 namely the neglect of the swap market in 
foreign exchange. Correction of this lapse greatly reduces the neutral 
zone* of deviations from covered interest parity (CIP) that can be 
attributed to transactions costs. 

The paper also presents empirical estimates of transactions costs 
and of deviations from CIP, together with estimates of the extent 


My thanks go to Steven Beal for able research assistance and to David Longworth for 
useful suggestions. The views expressed in this paper are those of the author alone and 
are not necessarily the views of the Bank of Canada. 

1 Much of the literature has appeared in this Journal. The main contributions are
Frenkel and Levich (1975, 1977, 1981), Deardorff (1979), Callier (1981), and
Bahmani-Oskooee and Das (1985).

* The neutral zone is that zone of covered interest differentials within which covered 
movements of funds do not yield profits net of transactions costs. 

[Journal of Political Economy, 1988, vol. 96, no. 2]
of profitable trading opportunities from covered arbitrage in the
Euromarket, calculated from a new set of daily observations in which 
careful attention has been given to accurate timing. Deviations from 
CIP, for currencies not subject to official controls, are small enough in 
this new data set to support the assumption of CIP, which is a power¬ 
ful simplifying assumption often made in models of exchange mar¬ 
kets. These data thereby confirm the abundant anecdotal evidence 
from market participants that opportunities for earning even as much 
as 0.25 percent (gross) are quite exceptional (e.g., Stigum 1978, p. 
136). But the observed deviations nevertheless frequently fall outside 
the calculated neutral zone, even when a wide margin is made for 
measurement error. This is evidence of a more or less regular flow of 
opportunities to make deals on which the return exceeds transactions 
costs. 

These findings help one to understand why most transactions in 
covered interest arbitrage take place at gross spreads that are negligi¬ 
ble for many economic purposes. However, the results provide no 
evidence either way on the market efficiency hypothesis, which is very 
difficult to test against data on covered interest differentials. 

II. Foreign Exchange Swaps 

To begin, it may help to specify what is meant in exchange markets by 
a swap. 3 A swap is a transaction in which one maturity of foreign
currency is exchanged for another; in the present discussion we will 
be concerned only with swaps of spot exchange for some short-term 
forward maturity. For example, a U.S. bank wishing to invest in a 
covered deutsche mark asset for a period of time can, in a single 
transaction, swap into the necessary marks spot and out of them for¬ 
ward. There has long been a deep interbank swap market in foreign 
exchange for maturities of up to a year in the major currencies. Banks 
quote a bid and ask price, known as the swap rate, which is nothing 
more than the forward premium or discount on the relevant ex¬ 
change rate. 

Most of the functions that economists somewhat loosely ascribe to 
the forward market are in fact performed in the swap market. Logi¬ 
cally, there cannot be independent markets in both swaps and out¬ 
right forward exchange, and in practice there is no organized inter¬ 
bank market for outright forward exchange like that for swaps. Given 


3 The swap facility existed in foreign exchange long before the recent ballooning in
other financial swaps. Shepherd (195S, 1980, 1984), Stigum (1978), and Kubarych
(1983) provide descriptions of the functioning of the exchange market swap. See also
Mahajan and Mehta (1986).
the spot price of foreign exchange, the forward price is calculated
from the swap rate. 4 This description of the markets is well illustrated 
by the fact that trading-room information services carry bid and ask 
quotes for spot exchange and for swaps but no quotes for outright 
forward exchange as such. 

This says nothing about causality in the determination of spot and 
forward exchange rates. In modern theories of exchange rate deter¬ 
mination, causality goes from the expected future spot exchange rate 
(equal to the forward exchange rate adjusted for any risk premium 
that may exist) to the current spot rate (via the interest rate differen¬ 
tial), and it is convenient to abstract from the institutional setup. From 
the viewpoint of causality, the institutional details are potentially con¬ 
fusing and not too important. But they are of central importance for 
the influence of transactions costs on covered interest spreads. Thus it 
is wrong to assume that the cost of the foreign exchange transactions 
in a covered investment is equal to the transaction cost of an outright 
spot plus that of an outright forward operation. The relevant transac¬ 
tion cost is simply that of a swap. 

As an incidental point, it follows that the transaction cost of out¬ 
right forward exchange equals the transaction cost of spot exchange 
plus that of the swap operation required to obtain the desired matu¬ 
rity. This explains why bid-ask spreads in outright forward exchange 
are always greater than those in spot exchange. 
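The relation between the three foreign exchange costs can be illustrated numerically. The sketch below is not taken from the paper: the quotes are hypothetical, the helper names are invented for the example, and it assumes that outright forward quotes are built by adding signed swap quotes to the corresponding spot quotes. Costs are left in price units rather than annualized percentages.

```python
def outright_forward(spot_bid, spot_ask, swap_bid, swap_ask):
    """Outright forward quotes built by adding swap quotes to spot quotes.

    Assumes signed swap quotes with swap_bid <= swap_ask.
    """
    return spot_bid + swap_bid, spot_ask + swap_ask


def half_spread(bid, ask):
    """Transaction cost measured as one-half of the bid-ask spread."""
    return (ask - bid) / 2.0


# Hypothetical quotes for one currency pair.
spot_bid, spot_ask = 1.3990, 1.4000
swap_bid, swap_ask = 0.0040, 0.0044

fwd_bid, fwd_ask = outright_forward(spot_bid, spot_ask, swap_bid, swap_ask)

t_s = half_spread(spot_bid, spot_ask)   # spot cost
t_w = half_spread(swap_bid, swap_ask)   # swap cost
t_F = half_spread(fwd_bid, fwd_ask)     # outright forward cost
```

Under these assumptions the outright forward half-spread is the sum of the spot and swap half-spreads, so it necessarily exceeds the spot half-spread whenever the swap spread is positive.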

III. Theoretical Bounds on Divergence 
from Interest Arbitrage 

Those transactors with the lowest transactions costs determine the 
maximum limits on divergences of prices from parity that are created 
by such costs. Here it is easiest to think of the least-cost arbitrageur as 
a Euromarket bank. It can be shown on the basis of arguments 
originated by Deardorff (1979) and Callier (1981) that the effective 
constraints in covered interest arbitrage are given by the lower of two 
possible bounds. The first is implied by Deardorff’s concept of “one¬ 
way arbitrage”; the second is implied by regular covered arbitrage. 

I use the following notation: $t_s$ is the transaction cost in spot
exchange, $t_w$ that in forward swaps, $t$ that in domestic currency assets,
and $t^*$ that in foreign currency assets; $S$ is the spot price of foreign
currency and $w$ the swap rate (forward premium) of a given term on


4 For example, Kubarych (1983, p. 32) says that "interbank swap rates... provide the 
main guide for setting a rate for an outright forward transaction.” Forward rate time 
series used by economists are derived by adding spot and swap rates. 
foreign currency. The costs $t_w$, $t$, and $t^*$ and the swap rate $w$ are
expressed in the same units as the interest rates on domestic ($i$) and
foreign ($i^*$) currencies. The variables $i$, $i^*$, $w$, and $S$ all represent rates
at which trades are actually made. The parity forward premium, $w_0$, is
the one that would exactly equate covered interest rates in the absence
of transactions costs. On short-term assets it is closely approximated
by the interest differential so that $w_0 = i - i^*$.

One-way arbitrage then implies bounds on deviations from parity
given by

$$|w - w_0| \le t + t^* - t_w. \tag{1}$$

Inequality (1) is identical to Deardorff’s proposition 2, given that
$t_F = t_s + t_w$, where $t_F$ is the transaction cost of an outright forward
transaction. A nontrivial equilibrium is possible only if $t_w \le t + t^*$.
Otherwise transactions costs in the swap market are prohibitively
high, and dealers would not use the market.

Covered arbitrage can be thought of as an option either to reduce
deposit costs 5 or to secure a higher return on short-term assets. The
process imposes bounds on deviations from CIP that can be derived
by following the same steps as Callier. The bounds are given by the
inequality

$$|w - w_0| \le t_w - |t^* - t|. \tag{2}$$

Within the zone defined by (2), all banks prefer at the margin to
match their activities currency by currency, without doing covered
interest arbitrage. Because $t_w = t_F - t_s$, the bounds defined by (2)
would always prevent the bounds derived by Callier, $t_F + t_s - |t^* - t|$,
from binding. 6

For a nontrivial equilibrium in all relevant markets simultaneously,
it is necessary and sufficient that the following conditions hold jointly:

$$t_w \le t + t^*, \tag{3a}$$

$$t \le t^* + t_w, \tag{3b}$$

$$t^* \le t + t_w. \tag{3c}$$

Taken together, (1) and (2) imply

$$|w - w_0| \le \min(t + t^* - t_w,\; t_w - |t^* - t|). \tag{4}$$


5 This option would appear to be typical in the interbank market. Stigum (1978, p.
136), e.g., says that “to any funding officer, every Eurocurrency deposit is nothing but a
Eurodollar deposit with a swap transaction tagged on.”

6 Bahmani-Oskooee and Das (1985), evidently unaware of Callier’s contribution, 
later published the same inequality. 
[Figure 1: Maximum deviation from covered interest parity]



A neat result follows immediately from (4): the upper limit on the 
neutral zone can be no greater than the transaction cost in the market 
with the lowest transaction cost. 
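One way to see why the result follows from (4) is sketched below. The intermediate symbols $a$ and $b$ are introduced here only for the illustration and do not appear in the original argument.

```latex
\begin{align*}
a &= t + t^* - t_w, \qquad b = t_w - |t^* - t|, \\
\min(a, b) &\le \tfrac{1}{2}(a + b)
            = \tfrac{1}{2}\bigl(t + t^* - |t^* - t|\bigr)
            = \min(t, t^*), \\
\min(a, b) &\le b \le t_w .
\end{align*}
% Hence |w - w_0| \le \min(a, b) \le \min(t, t^*, t_w).
```

The first line restates the two bounds in (4); the second uses the fact that the smaller of two numbers cannot exceed their average; the third uses $|t^* - t| \ge 0$.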

It is worth giving the derived inequalities some economic interpretation.
In (1) the maximum divergence allowed by transactions costs
from forward parity decreases one for one as the transaction cost in the
swap market increases. The reason for this counterintuitive property,
as explained by Deardorff, is that with higher swap transactions costs
there is less room for divergences from forward parity. At the
threshold, $t_w = t + t^*$, there is no room at all: any deviation from
forward parity or any increase in $t_w$, other things being equal, would
eliminate the swap market. The more intuitive property that increased
costs of swaps increase possible deviations from parity is captured
in (2).

Figure 1 illustrates some properties of (4), for a fixed $t^*$ and $t$,
under the supposition that $t^* > t$. The absolute value of the deviation
from covered parity is plotted on the vertical axis as a function of $t_w$,
which varies along the horizontal. The only admissible values permitted
by inequalities (3a)-(3c) for $t_w$ are between $t_w = t^* - t$ and
$t_w = t + t^*$, and the solid lines give the relevant upper limits on the
neutral zone for $|w - w_0|$. The peak of the neutral zone occurs at the
intersection at which $t_w = t^*$. At this point the upper limit on the
deviation from interest parity is $t$, which is less than both $t_w$ and $t^*$.
This illustrates the result that the lowest of the three transactions costs
caps the neutral zone.
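The shape described for figure 1 can be reproduced with a small numerical sketch. The function name and the cost values below are hypothetical; costs are expressed as integer basis points so that the boundary cases of the viability conditions are evaluated exactly.

```python
def neutral_zone_bound(t, t_star, t_w):
    """Upper limit on |w - w0| from inequality (4).

    Returns None when any of the viability conditions (3a)-(3c) fails.
    """
    viable = (t_w <= t + t_star) and (t <= t_star + t_w) and (t_star <= t + t_w)
    if not viable:
        return None
    one_way = t + t_star - t_w        # bound from one-way arbitrage, inequality (1)
    covered = t_w - abs(t_star - t)   # bound from covered arbitrage, inequality (2)
    return min(one_way, covered)


# Hypothetical costs in basis points, with t* > t as supposed in the text.
t, t_star = 5, 8

# Sweep the swap cost over the admissible range t* - t = 3 to t + t* = 13.
bounds = {t_w: neutral_zone_bound(t, t_star, t_w) for t_w in range(3, 14)}
peak_t_w = max(bounds, key=bounds.get)
```

In this example the bound rises from zero at the lower end of the admissible range, peaks at a swap cost equal to `t_star` with a value equal to `t`, and falls back to zero at the upper end.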
TABLE 1

                   Memo:      90-Day       Eurocurrency                      Theoretical Limits
                   Spot       Forward      Deposit                           from Inequality (4)
Currency           $t_s$ (a)  Swap $t_w$   $t^*$ (b)      $t^* + t - t_w$    for $|w - w_0|$

Canadian dollar    .028       .054         .0625          .071               .054
Deutsche mark      .023       .034         .0625          .091               .034
French franc       .039       .105         .177           .135
Japanese yen       .031       .038         .0625          .087               .038
U.K. pound         .035       .064         .0625          .061               .061
U.S. dollar                                .0625

Note.—Transactions costs are the mean bid-ask spread divided by two. Percentage rates are per annum.
(a) Not annualized. Not relevant to interest arbitrage.
(b) The transaction cost for the U.S. dollar is denoted by $t$.


IV. Evidence 



A. The Measurement of Transactions Costs 

As pointed out by McCormick (1979), upward bias in transaction cost 
estimates can arise from the triangular parity method originated by 
Frenkel and Levich (1975). This can be avoided if estimates of trans¬ 
actions costs in all markets are taken directly from bid-ask spreads. 
The simplest assumption is that the transaction cost parameter is 
equal to one-half of the posted bid-ask spread 7 since the spread itself 
is what is given up in the two transactions of a “round-trip.” A priori, 
bias in this measure might be positive or negative. Upward bias might 
arise since posted spreads are usually greater than the spreads at 
which deals are actually made. Downward bias might arise since some 
costs might not be covered by the spread. In a competitive market, 
however, one would expect actual spreads to reflect quite closely those 
dealer costs that vary with transaction size, most notably those that 
arise from price volatility. Fieleke (1975) and Overturf (1982), for 
example, established that exchange rate variance is an empirically 
significant determinant of bid-ask spreads. Brokerage fees are negli¬ 
gible in wholesale exchange markets relative to these spreads. 8 

Estimates are presented in table 1 of Euromarket transactions costs 
in five foreign currencies: the Canadian dollar, the deutsche mark, 
the French franc, the Japanese yen, and the U.K. pound. The United 

7 In contrast, Frenkel and Levich multiplied the security market spread by 1.25 
instead of 0.5, following Demsetz (1968). It is not clear how relevant Demsetz's estimate 
(for the New York Stock Exchange, which does a lot of retail trade) is to money and 
currency markets. The brokerage fee in his data is much higher than that charged in 
the latter markets. 

8 For major foreign currencies the brokerage fee is on the order of $25 or less per $1
million.
States is presented as the domestic country, and all exchange rates are
measured as the foreign currency price of one U.S. dollar. For direct
comparability with interest rates, all costs relevant to interest arbitrage
are written in annual percentage rates. 9 The data were taken from
midmorning quotes on the Reuter Money Rates Service from November
1985 to May 1986, a period with a fair degree of exchange market
turmoil.

The average values for bid-ask spreads in the spot market indicate a 
mean transaction cost ranging from 0.023 percent for the mark to 
0.039 percent for the franc. These estimates are much lower than 
Frenkel and Levich’s for the floating rate period, which were around 
0.50 percent. In the light of McCormick’s demonstration of the errors 
that can be caused by mistiming, 10 one might suspect the latter esti¬ 
mate to be severely biased upward. 

As regards the results that apply to interest arbitrage, mean estimates
for transactions costs in 90-day swaps range from 0.034
percent per annum for the mark to a high of 0.105 percent for the
franc. Posted bid-ask spreads on five of the six Eurocurrency 90-day
deposits are invariably 0.125 percent, 11 which implies a transaction
cost of 0.0625 percent. 12

For the franc the deposit spread changed significantly from day to 
day over the sample period and was on average much higher than 
0.125 percent. This could well reflect the capital controls and exten¬ 
sive official market intervention of the French authorities. However, 
it is also likely that the Eurofranc transaction cost is overestimated
because condition (3c), which is necessary for a viable Eurofranc market,
is broken. Although the Eurofranc market is in fact less active
than that of the other five currencies, it is nevertheless implausible 
that the franc transaction cost is prohibitively high. 

B. The Size of the Neutral Zone 

For all currencies other than the franc, inequality (4) reduces to 

$$|w - w_0| \le \min(.125 - t_w,\; t_w). \tag{5}$$

9 All variables could be expressed in quarterly rates with no effect on the argument. 

10 From his most precisely timed data, McCormick estimated a transaction cost for 
spot exchange from triangular parity deviations of 0.090 percent. 

11 Note that Johnston (1982) says that for the Eurodollar “typically prime name
banks would lend funds between themselves for margins as low as 1/16 (and sometimes
1/32) of 1 per cent” (p. 16). For the Canadian dollar a spread of 0.25 percent is
commonly posted by convention, but the effective spread is never greater than 0.125
percent.

12 Frenkel and Levich calculated a transaction cost equivalent to 0.125 percent per
annum from bid-ask spreads, which is twice as high as the estimates above, apparently
for reasons discussed in n. 7.
Estimates of the average value of these limits are given by the lower of
the two figures in the second and fourth columns of table 1, except
for the franc. The bounds on the neutral zone are repeated for convenience
in the fifth column. These estimates imply that covered interest
arbitrage rather than one-way arbitrage normally imposes the effective
limits on the neutral zone.
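The fifth column of table 1 can be recomputed from the reported costs as a check. The sketch below uses the swap and deposit costs as printed in table 1; the function and variable names are introduced only for the illustration.

```python
T_DEPOSIT = 0.0625  # Eurocurrency deposit cost reported for all currencies but the franc

# Mean 90-day swap transaction costs t_w, as reported in table 1.
swap_cost = {
    "Canadian dollar": 0.054,
    "Deutsche mark": 0.034,
    "Japanese yen": 0.038,
    "U.K. pound": 0.064,
}


def neutral_zone_limit(t_w, t=T_DEPOSIT, t_star=T_DEPOSIT):
    """Right-hand side of inequality (4); reduces to (5) when t = t* = .0625."""
    return min(t + t_star - t_w, t_w - abs(t_star - t))


limits = {name: round(neutral_zone_limit(t_w), 3) for name, t_w in swap_cost.items()}

# For the franc, the deposit cost reported in table 1 is .177 and the swap cost .105.
franc_limit = neutral_zone_limit(0.105, t_star=0.177)
```

For the Canadian dollar, the mark, and the yen the swap cost itself is the binding term, whereas for the pound the one-way arbitrage term .125 - .064 = .061 is slightly lower. With the franc figures the expression is negative, which is consistent with the blank entry for the franc in the fifth column.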

The most important result is that transactions costs should not give 
rise to deviations from CIP in excess of about 0.06 percent per annum 
between well-traded currencies. This is substantially lower than ear¬ 
lier estimates for the floating exchange rate period. The essential 
reason for this is the introduction of the swap market into the model 
of arbitrage, not the different empirical cost measures used. Estimates 
of the swap transaction cost implied by the condition $t_w = t_F - t_s$ from,
for example, Frenkel and Levich (1977, 1981) lie in a range similar to 
those presented in table 1. 

C. Hypothesis Tests 

It is doubtful that the hypothesis of market efficiency (“no unexploited
profits”) can be given a rigorous test for covered interest arbitrage
with data so far available. Frenkel and Levich’s tests have very
little power because of their probable overestimates of the width of
the neutral zone. More fundamentally, it is neither necessary nor
sufficient for efficiency that deviations from CIP lie within the neutral 
zones that can be constructed from existing data because it is not 
known if the prices of transactions services (e.g., as reflected either in 
bid-ask spreads or in triangular arbitrage deviations) are efficiently 
determined, and it is not known if the calculated costs properly incor¬ 
porate all the longer-run opportunity costs of quasi-fixed factors. 
Since the theory indicates that very fine tolerances are involved, a 
valid test of the efficiency hypothesis demands a precision of the 
empirical estimates that has not as yet been achieved. 

A more modest hypothesis, which turns out to be fruitful, is therefore
explored in this section: that there are no returns in excess of
those transactions costs represented by posted bid-ask spreads. I
define such returns to be “profitable trading opportunities.”

Before I proceed, measurement error must be taken into account. 
Point estimates of deviations from CIP calculated from posted bids 
and asks will not in general be equal to the actual deviations on either 
the bid or the ask side because, as noted in Section IVA, the quotes 
define only a range within which trades may take place. A simple 
adjustment to the neutral zone allows a generous margin for this 
measurement error. Let the variables $i$, $i^*$, and $w$ be interpreted as the
midpoints of observed bid-ask ranges so that $\pi_1 = w - w_0$ is the
TABLE 2

Summary Statistics for Deviations from CIP Measured at Midpoint
(Percent per Annum; Daily Data, November 21, 1985-May 9, 1986)

                   $\pi_1$            $\pi_2$                  $\pi_3$
Currency           Mean     $\sigma$  Mean     $\sigma$        Mean, Whole    Mean, Profitable     LCR*
                                                               Sample         Observations

Canadian dollar    -.013    .118      -.010    .074            [illegible]    [illegible]          3
Deutsche mark      -.040    .065      -.009    [illegible]     .016           [illegible]          5
French franc        .022    .233      -.048    .155            [illegible]    .112                 4
Japanese yen        .016    .093      -.004    [illegible]     [illegible]    [illegible]          3
U.K. pound         -.001    .077      -.055    .048            [illegible]    .031                 3

Note.—Raw ($\pi_1$) and net of transactions costs ($\pi_2$); $\pi_3 = \pi_2$ for $\pi_2 > 0$; otherwise $\pi_3 = 0$.
* Longest consecutive run of profitable trading opportunities in a given direction.


deviation from CIP measured at midpoints. Then the bids may be
written as $i + t$ and so forth and the asks as $i - t$ and so forth. Now
suppose that the actual rates lie somewhere between the bid and the
ask rate, that is, between $i \pm t$, between $i^* \pm t^*$, and between $w \pm t_w$. If
there are no profitable trading opportunities, the following inequality
must hold:

$$|\pi_1| \le \min[2(t + t^*),\; 2t_w]. \tag{6}$$

The bounds on measured deviations from parity defined by (6) are
given by the lower of (a) the sum of bid-ask spreads on the two
Eurocurrencies or (b) the spread on a swap. Profitable arbitrage
opportunities are indicated when the raw deviation falls outside that
bound. Thus if $\pi_2$ is the net trading profit on a transaction,

$$\pi_2 = |\pi_1| - \min[2(t + t^*),\; 2t_w], \tag{7}$$

then when $\pi_2 > 0$ a profitable trading opportunity exists.
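The calculation in (6) and (7) can be sketched as follows. The function names and the midpoint quotes are hypothetical and serve only to illustrate the arithmetic; the cost values are of the same order as those in table 1.

```python
def raw_deviation(i, i_star, w):
    """Raw deviation from CIP at midpoints: w - w0, with w0 approximated by i - i*."""
    return w - (i - i_star)


def net_trading_profit(pi_1, t, t_star, t_w):
    """Net deviation from equation (7): |pi_1| less the adjusted neutral-zone bound in (6)."""
    bound = min(2 * (t + t_star), 2 * t_w)
    return abs(pi_1) - bound


# Hypothetical midpoint quotes, percent per annum.
i, i_star = 7.875, 4.625          # domestic and foreign deposit rates
t, t_star, t_w = 0.0625, 0.0625, 0.054

pi_1_wide = raw_deviation(i, i_star, w=3.40)     # swap rate well above parity
pi_2_wide = net_trading_profit(pi_1_wide, t, t_star, t_w)

pi_1_narrow = raw_deviation(i, i_star, w=3.30)   # swap rate close to parity
pi_2_narrow = net_trading_profit(pi_1_narrow, t, t_star, t_w)
```

In the first case the raw deviation of 0.15 exceeds the bound of 0.108 and a profitable trading opportunity is indicated; in the second the raw deviation of 0.05 lies inside the bound and the net figure is negative.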

The extent of both raw and net deviations from parity can be
gauged from the summary statistics on $\pi_1$ and $\pi_2$ in table 2. The data
set on which this table is based contains just over 100 daily observations
on the five foreign currencies and the U.S. dollar; great care was
taken to obtain synchronous observations. 13 The bar over $\pi_1$ and $\pi_2$
indicates the sample mean, and $\sigma$ is the sample standard deviation.
The estimates of $t_w$ (and $t^*$ in the case of the franc) used to calculate $\pi_2$
vary from day to day.

The mean deviation adjusted for transactions costs, $\bar{\pi}_2$, is negative
for all currencies, implying that no profitable arbitrage opportunity
can be expected at a randomly chosen moment. 14 Nevertheless, the

13 See the Appendix for details on the data set. 

14 It might be noted that there is no estimation problem caused by the sampling 
interval (1 day) being shorter than the forward contract interval (3 months). The 
TABLE 3

Deviations from CIP ($\pi_1$)

                   Percentage of Observations      Percentile Boundaries†
Currency           Outside the Neutral Zone*       50%        95%

Canadian dollar    37                              .08        .22
Deutsche mark      34                              .04        .16
French franc       34                              .11        .49
Japanese yen       36                              .06        .18
U.K. pound         15                              .05        .15

* That is, for $\pi_2 > 0$.
† Percentage of deviations that lie between minus and plus the number shown (percentage per annum). For
example, 50 percent of the deviations for the Canadian dollar lie between -0.08 percent and +0.08 percent.



proportion of observations outside the neutral zone, presented in 
table 3, is sizable for each currency, varying from 15 percent for the 
pound to around 35 percent for the other currencies. This is a clear 
refutation of the hypothesis that there are no profitable trading op¬ 
portunities. 15 

Some characteristics of profitable trading opportunities can be discerned
from a trading rule based on $\pi_2$, that is, to trade only when $\pi_2 > 0$.
With this rule, the hypothetical sequence of returns per dollar
transacted is given by the absolute value of $\pi_3$, where $\pi_3 = \pi_2$ for
$\pi_2 > 0$ and $\pi_3 = 0$ for $\pi_2 \le 0$. Two measures of the average hypothetical
net return are presented in table 2: the mean of $\pi_3$ over the whole sample
and the mean over just those observations when a profitable transaction
is indicated. In the last column, LCR is the longest consecutive
run of profitable trading opportunities in a given direction. The
measures of average returns are all quite low, and while the longest
sequence of profitable trading opportunities is five observations, the
most common run does not extend beyond a single observation.
Thus, in general, profit opportunities appear to be both small and
short-lived, even though they are not rare. Moreover, the fiftieth and
ninety-fifth percentile boundaries from the sample distribution of $\pi_1$,
also presented in table 3, indicate that an overwhelming majority of
measured raw deviations from CIP for currencies other than the
franc are less than about 20 basis points.
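The summary measures for the trading rule can be computed along the following lines. This is a sketch with hypothetical data, not the paper's procedure: in particular, "in a given direction" is interpreted here as consecutive profitable observations whose raw deviation has the same sign, which is an assumption.

```python
def trading_rule_summary(pi_1, pi_2):
    """Summary of the rule 'trade only when pi_2 > 0'.

    pi_1 : raw deviations from CIP (sign gives the direction of the trade)
    pi_2 : deviations net of transactions costs, same length as pi_1
    """
    pi_3 = [p if p > 0 else 0.0 for p in pi_2]
    profitable = [p for p in pi_2 if p > 0]

    # Longest consecutive run of profitable observations in one direction.
    longest = current = 0
    prev_sign = 0
    for raw, net in zip(pi_1, pi_2):
        sign = (raw > 0) - (raw < 0)
        if net > 0 and sign != 0:
            current = current + 1 if sign == prev_sign else 1
            prev_sign = sign
        else:
            current, prev_sign = 0, 0
        longest = max(longest, current)

    return {
        "share_profitable": len(profitable) / len(pi_2),
        "mean_all": sum(pi_3) / len(pi_3),
        "mean_profitable": sum(profitable) / len(profitable) if profitable else 0.0,
        "lcr": longest,
    }


# Hypothetical daily observations with a constant bound of 0.11.
example_pi_1 = [0.15, 0.20, -0.18, 0.02, 0.16]
example_pi_2 = [0.04, 0.09, 0.07, -0.09, 0.05]
summary = trading_rule_summary(example_pi_1, example_pi_2)
```

In the example, four of the five observations are profitable, but the third is in the opposite direction from the first two, so the longest run in a given direction is two observations.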


estimation problem for which Hansen and Hodrick (1980) devised a solution applies to 
speculative operations in which forecasts of returns overlap sampling intervals. No such 
problem arises in arbitrage because returns are known with certainty the minute the 
deal is struck, regardless of the length of the contract. 

15 Calculations with a monthly data set, of quality similar to that used by previous
researchers, revealed larger deviations from CIP, i.e., more opportunities for profitable
transactions.
These results do not necessarily imply the existence of any excess
returns, despite the significant percentages of profitable trading op¬ 
portunities. To illustrate this point, suppose it is true that raw devia¬ 
tions from CIP do not exceed the total costs of all factors employed in 
arbitrage, including those devoted to information gathering that are 
not embodied in bid-ask spreads, and hence do not incorporate any 
excess returns. Then it can be inferred from (6) that the average total 
cost of arbitrage, broadly defined, is about one-half of the maximum 
observed raw deviation from CIP. On the basis of the “95 percent 
rule” that Frenkel and Levich used for removing measurement error, 
implied average costs would be one-half of the ninety-fifth percentile 
bounds in the final column of table 3. This would suggest that average 
total costs in 90-day arbitrage range from 0.08 percent per annum for 
the pound and the mark up to 0.25 percent per annum for the franc. 
These broad measures of the costs of arbitrage services are not so 
high as to be implausible. Indeed, since they are very much lower 
than previous estimates, it is difficult to believe that excess returns are 
any significant part of the arbitrage process. 
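The implied cost figures quoted above follow from halving the ninety-fifth percentile bounds reported in table 3, as this short calculation shows. The variable names are introduced only for the illustration.

```python
# Ninety-fifth percentile bounds on raw deviations, as reported in table 3
# (percent per annum).
p95_bound = {
    "Canadian dollar": 0.22,
    "Deutsche mark": 0.16,
    "French franc": 0.49,
    "Japanese yen": 0.18,
    "U.K. pound": 0.15,
}

# Implied average total cost of arbitrage: one-half of the bound.
implied_cost = {name: bound / 2 for name, bound in p95_bound.items()}

lowest = min(implied_cost, key=implied_cost.get)
highest = max(implied_cost, key=implied_cost.get)
```

The halved figures are 0.075 for the pound and 0.08 for the mark at the low end and 0.245 for the franc at the high end, which match the 0.08 and 0.25 percent quoted in the text after rounding to two decimal places.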


V. Conclusions 

Transactions costs create much less room for deviations from CIP 
than has been suggested in the economic literature. In consequence, 
the claim that transactions costs account entirely for observed devia¬ 
tions from CIP in the Euromarket can be rejected. 

This paper derives the strong theoretical result that deviations 
from parity attributable to transactions costs cannot exceed the single 
lowest transaction cost in one of three markets: that on an exchange 
market swap or that on a security in one of the two currencies in¬ 
volved. If transactions costs are approximated by 0.5 times the bid-ask 
spreads prevailing in late 1985 and early 1986, then the bounds on 
the neutral zone of covered interest differentials between the U.S. 
dollar and the five Eurocurrencies studied here would be within 
±0.06 percent per annum from parity. 

The new data set on daily bid and ask quotes does yield smaller 
deviations from CIP than previously available data sets do. For the 
four currencies not subject to capital controls, the overwhelming ma¬ 
jority of raw deviations calculated from the new data occur within 20 
basis points of parity. The true unobserved deviations, realized in 
actual trades, would lie within an even narrower range since dealers 
normally dicker for finer spreads than those posted. However, even 
with generous allowance for measurement error, the distributions of 
deviations are by no means entirely contained within the calculated 
neutral zones, and so the hypothesis of no profitable trading opportunities
is rejected. It might nevertheless be argued that, empirically,
profitable trading opportunities are neither large enough nor long- 
lived enough to yield a flow of excess returns over time to any factor. 

The results are consistent with a market in which there are no 
excess profits but in which, because of information costs, prices do not 
immediately reflect all available information. 16 However, the results 
shed no light on the empirical validity of the efficiency hypothesis, 
which has yet to be stated in terms that allow meaningful tests against 
data on covered arbitrage. Not that this need cause concern because 
the finding that deviations from CIP are very small and transitory is 
itself sufficient to justify the useful simplifying assumption of CIP in 
models of exchange markets, regardless of whether covered interest 
arbitrage is efficient (by some definition) or not. For all practical pur¬ 
poses, the latter question is therefore a side issue. 

A final conclusion is that Deardorff’s paradox—that one-way arbi¬ 
trage seems to nullify any opportunity for covered interest arbi¬ 
trage—is resolved. Once the role of the swap market in foreign ex¬ 
change is properly modeled and bid-ask spreads in the relevant 
markets are compared, it becomes clear that regular covered interest 
arbitrage will in general imply lower bounds on deviations from CIP 
than those implied by one-way arbitrage. 


Appendix 

Daily Data 

The data come from Reuter Money Rates Service trading-room screen, mid¬ 
morning, eastern standard time, November 21, 1985-May 9, 1986. 

The swap rate, at an annual percentage rate, is calculated as 

$$\text{swap rate} = \frac{36{,}000}{n} \cdot \frac{W_a + W_b}{S_a + S_b},$$

where $W_a$ and $W_b$ are, respectively, the ask and bid swap rates, $S_a$ and $S_b$ are 
the ask and bid spot rates, and $n$ is the number of days in the contract. The 
latter is determined by the procedures described in Longworth, Boothe, and 
Clinton (1983, app. A), and in the present sample it varies between 88 and 95 
days. 

The swap transaction cost is calculated as 

$$\text{swap transaction cost} = \frac{36{,}000}{n} \cdot \frac{W_a - W_b}{S_a + S_b}.$$
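The two appendix formulas can be illustrated with a short calculation. The following is a minimal sketch, not the author's code: the function names and the sample quotes are hypothetical, and the only inputs assumed are the bid and ask swap rates, the bid and ask spot rates, and the contract length in days.

```python
def swap_rate_pct(w_ask, w_bid, s_ask, s_bid, n_days):
    """Annualized swap rate in percent: (36,000 / n) * (W_a + W_b) / (S_a + S_b)."""
    return 36_000 / n_days * (w_ask + w_bid) / (s_ask + s_bid)


def swap_cost_pct(w_ask, w_bid, s_ask, s_bid, n_days):
    """Annualized swap transaction cost in percent: (36,000 / n) * (W_a - W_b) / (S_a + S_b)."""
    return 36_000 / n_days * (w_ask - w_bid) / (s_ask + s_bid)


# Hypothetical quotes for a 90-day contract (illustrative values, not from the data set).
example_rate = swap_rate_pct(0.0012, 0.0010, 1.5010, 1.5000, 90)
example_cost = swap_cost_pct(0.0012, 0.0010, 1.5010, 1.5000, 90)
```

Dividing by the sum of the two spot quotes rather than by their midpoint means the cost expression equals half the quoted swap spread relative to the mid spot rate, which matches the text's approximation of transactions costs as 0.5 times the bid-ask spread.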

The data are available from the author. 

18 This might be an empirical analogy to the kind of market equilibrium described by 
Grossman and Stiglitz (1980). 
Why Have Some Farmers Opposed Futures 
Markets? 


B. Peter Pashigian 

University of Chicago 


A self-interest explanation is presented for opposition by some farm 
groups to futures markets. During the twenties and thirties political 
opposition to futures markets was greater in the grain-producing 
states. The opposition was centered in Minnesota, North Dakota, 
South Dakota, Montana, and a few other states. The line elevator 
companies were prominent in these states and not others and used 
futures prices to facilitate a buying cartel. Futures prices were used 
to derive a suggested buying price for elevator purchases in each 
local market. The political opposition to futures by farmers was de¬ 
signed to raise the cost of operating local cartels. Political opposition 
was greater and gross profit margins of elevators were higher in 
states with line elevators. 


I. Introduction 

Farmer opposition to futures markets throughout the last quarter of 
the nineteenth century and the first three decades of this century is a 
conundrum. Farmers should benefit from futures directly by hedging 
or indirectly when middlemen hedge their purchases from farmers. 
Grain elevators have long hedged their wheat purchases from the 
farmer. If the use of futures was prohibited, these elevators would 
have had to resort to less efficient hedging substitutes and would 
thereby reduce the derived demand for the wheat. Prohibiting the 
use of futures would have the same qualitative effect on the farmer as 
prohibiting the paving of rural roads. 

This paper reconsiders the role of self-interest in explaining why 
farmers opposed futures markets. It detects opposition to futures 
markets through the political market, by Senate votes supporting 
legislation that opposed futures markets. Three votes on bills to ban 
or to tax futures taken in the twenties and thirties are examined, first 
by region and then by state. The votes trace out an interesting pat¬ 
tern. Political opposition to futures markets was concentrated in the 
West North Central region, the grain-producing region of the coun¬ 
try. It was not evenly distributed within this region but was greater in 
the states of the Northwest Territory, Montana, North Dakota, South 
Dakota, Minnesota, and a few other states. This paper offers an ex¬ 
planation for this unusual geographical pattern of political opposi¬ 
tion. 

Section II of the paper presents the votes on futures legislation. A 
cartel theory for why the opposition to futures was centered in se¬ 
lected wheat-producing states of the Northwest is presented in Sec¬ 
tion III. Some supporting qualitative evidence and tests of the cartel 
theory are presented in Section IV. 

II. Legislative Proposals to Tax or Prohibit 
Futures 

Throughout the late 1920s and the 1930s the Senate voted on propos¬ 
als to change the tax on or to ban the use of futures (Cowen 1965). 
The bills or amendments were introduced in 1927, 1928, and 1938 
and would have either banned or changed the tax on futures transac¬ 
tions. Each proposal was defeated. These proposals are of interest 
because they focused exclusively on futures transactions. 1 Political 
support for the taxation of futures is measured by the percentage of 
votes favoring higher taxes or a ban. A brief description of these 
proposals is given in Appendix A. 

Table 1 shows the percentage of votes in each census region that 
supported the three proposals to tax or to ban futures (col. 1). Col¬ 
umn 2 shows each region’s percentage relative to the percentage for 
the West North Central region. The greatest opposition to futures 

1 The Senate also voted on the Black Amendment, which would have prohibited the 
use of futures by the Federal Farm Board while attempting to stabilize wheat prices. 
The West North Central region strongly opposed any attempt to ban the use of futures. 
The support by wheat farmers for the use of futures was due to the generous prices 
offered by the Federal Farm Board and was in the self-interests of wheat producers. 
TABLE 1 

Percentage of Votes Supporting a Tax or Ban on Futures by Region 

                       Percentage of Votes         Region Percentage
                       Supporting Tax on Futures   Relative to West North Central
Region                 (1)                         (2)

United States          44                           .66
New England            10                           .15
Middle Atlantic         0                           .00
East North Central     33                           .49
West North Central     67                          1.00
South Atlantic         34                           .51
East South Central     20                           .30
West South Central     55                           .82
Mountain               30                           .45
Pacific                46                           .69


came from the West North Central region, the grain-producing cen¬ 
ter of the country. Considerable though less support for these pro¬ 
posals came from the West South Central region, another region with 
a populist tradition. The commercial and industrial regions of the 
country mustered the strongest opposition to any restrictions on com¬ 
mercial transactions. 
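Column 2 of table 1 is simply column 1 divided by the West North Central percentage. A minimal sketch of that arithmetic follows; the dictionary and function names are illustrative and not part of the original paper.

```python
# Column 1 of table 1: percentage of votes supporting a tax or ban on futures.
VOTE_PCT = {
    "United States": 44,
    "New England": 10,
    "Middle Atlantic": 0,
    "East North Central": 33,
    "West North Central": 67,
    "South Atlantic": 34,
    "East South Central": 20,
    "West South Central": 55,
    "Mountain": 30,
    "Pacific": 46,
}


def relative_to_base(pct_by_region, base="West North Central"):
    """Express each region's percentage as a ratio to the base region's percentage."""
    base_pct = pct_by_region[base]
    return {region: round(pct / base_pct, 2) for region, pct in pct_by_region.items()}


# Reproduces column 2 of table 1 to two decimal places.
relative = relative_to_base(VOTE_PCT)
```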

Opposition to futures markets was not uniformly distributed 
throughout the states in the West North Central region but was con¬ 
centrated in the states of the Northwest Territory. Table 2 shows the 
percentage of votes favoring a tax or a ban on futures for each of the 
14 grain-producing states. Support for a tax on futures was greater 
throughout the states of the Northwest, Nebraska, and Wisconsin and 
lower in the older grain-producing states, where futures were used 
less often for hedging purposes and where the commercial line 
elevators were relatively unimportant (col. 2). A theory of farmer 
opposition to futures should be able to account for this voting pattern. 
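The group means reported in table 2 can be recomputed from the state rows. The sketch below is illustrative only: the variable names are hypothetical, and Nebraska's vote percentage is read as 100, the value implied by the reported mean of 50.6 for the other states west of the Mississippi. On the same arithmetic, the mean vote percentage for the states east of the Mississippi is 30.6.

```python
# Each entry is (percentage of votes supporting a tax on futures,
#                market share of commercial line elevators), as in table 2.
TABLE_2 = {
    "Northwest Territory": {
        "Minnesota": (60, 46),
        "North Dakota": (100, 54),
        "South Dakota": (100, 43),
        "Montana": (80, 50),
    },
    "Other states west of Mississippi": {
        "Nebraska": (100, 49),  # assumption: vote entry read as 100
        "Kansas": (60, 9),
        "Iowa": (33, 29),
        "Oklahoma": (60, 21),
        "Missouri": (0, 11),
    },
    "States east of Mississippi": {
        "Wisconsin": (100, 14),
        "Illinois": (0, 30),
        "Indiana": (0, 32),
        "Michigan": (33, 27),
        "Ohio": (20, 22),
    },
}


def group_means(states):
    """Return (mean vote percentage, mean line-elevator market share) for one group."""
    votes = [vote for vote, _ in states.values()]
    shares = [share for _, share in states.values()]
    return sum(votes) / len(votes), sum(shares) / len(shares)


MEANS = {group: group_means(states) for group, states in TABLE_2.items()}
```

The comparison of interest is across groups: the Northwest Territory states combine the highest mean support for a tax with the highest mean line-elevator share.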


III. A Self-Interest Theory of Farmer Opposition 
to Futures Markets 

If elevator services are competitively priced, farmer interests are best 
served and land rents maximized when the cost of marketing the crop 
is minimized. Farmers would then have little interest in banning the 
TABLE 2 

Votes on Bills to Tax Futures and Market Share of Line Elevators in 
Grain-producing States 

                                      Percentage of Votes   Market Share of
                                      Supporting Tax on     Commercial Line
States                                Futures               Elevators

States of the Northwest Territory:
  Minnesota                            60                   46
  North Dakota                        100                   54
  South Dakota                        100                   43
  Montana                              80                   50
  Mean                                 85.0                 48.3

Other states west of Mississippi:
  Nebraska                            100                   49
  Kansas                               60                    9
  Iowa                                 33                   29
  Oklahoma                             60                   21
  Missouri                              0                   11
  Mean                                 50.6                 23.8

States east of Mississippi:
  Wisconsin                           100                   14
  Illinois                              0                   30
  Indiana                               0                   32
  Michigan                             33                   27
  Ohio                                 20                   22
  Mean                                 30.6                 25.0

Source: U.S. Federal Trade Commission (1920, vol. 1, app. table 2) 


use of futures by elevators. 2 They would be harmed by banning the 
use of futures and thereby raising the cost of marketing the crop. 
Farmer support for such a ban could be rational if (1) the elevator 
market was not competitively organized and if (2) futures prices were 
used by line elevators to facilitate a buying cartel. 

Under these assumptions, wheat farmers could have benefited if 
futures were banned. The second assumption implies that a ban 
would raise the cost of collusion, increase competition among 
elevators (at a single station or between stations), and raise the price 
received by the farmer. On the other hand, a ban on the use of 
futures would reduce elevator derived demand for the local grain 
crop and harm farmers because elevators would have to turn to less 
efficient hedging mechanisms. 

The two effects of a ban are shown in figure 1, where DD represents 
the derived demand for wheat by a single or a few local elevators. The 

2 Farmers might favor a ban on futures if they had better private information, whose 
value would diminish when futures markets are allowed to operate. This issue is dis¬ 
cussed in Pashigian (1987). 
supply curve of local wheat is SS, and MC is the marginal cost of wheat 
to the buyers. Acting together, elevators purchase OQ bushels of 
wheat at a price of P. Suppose that a prohibitive tax is imposed on 
futures. By assumption the tax raises the cost of collusion (as de¬ 
scribed more fully below) and the hedging costs of the elevator. The 
derived demand for the local wheat crop declines to D'D’ because the 
elevators now use less efficient hedging methods. If the cartel col¬ 
lapses because of the ban, the price received by farmers rises from P 
to P' and OQ' bushels are purchased. Obviously, the argument is still 
valid, even though the cartel does not completely collapse, as long as 
the quantity purchased increases. Farmers could be harmed if the 
inefficiency effect dominates the monopsony effect. If the cartel argu¬ 
ment explains the votes cast by senators from the Northwest, it im¬ 
plies that the monopsony effect dominated the inefficiency effect. 
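The trade-off between the monopsony effect and the inefficiency effect can be made concrete with a numerical sketch. Everything below is an illustrative assumption rather than part of the original analysis: supply and derived demand are taken to be linear, the ban is assumed to shift derived demand down by a constant hedging cost per bushel, and the cartel is assumed to collapse completely once futures are banned.

```python
def monopsony_outcome(a, b, c, d):
    """Cartel purchase under linear supply P = a + b*Q and derived demand c - d*Q.

    The cartel buys up to the point where marginal cost (a + 2*b*Q) equals
    derived demand and pays the supply price at that quantity.
    """
    quantity = (c - a) / (2 * b + d)
    return quantity, a + b * quantity


def competitive_outcome(a, b, c, d, hedging_cost):
    """Competitive purchase after a ban lowers derived demand by hedging_cost per bushel."""
    quantity = (c - hedging_cost - a) / (b + d)
    return quantity, a + b * quantity


# Hypothetical parameters: supply P = Q, derived demand 12 - Q.
cartel = monopsony_outcome(0, 1, 12, 1)               # before the ban
small_loss = competitive_outcome(0, 1, 12, 1, 2)      # mild hedging inefficiency
large_loss = competitive_outcome(0, 1, 12, 1, 6)      # severe hedging inefficiency
```

With these numbers the farm price rises from 4 to 5 when the hedging cost is small, so the monopsony effect dominates, and falls from 4 to 3 when the hedging cost is large, so the inefficiency effect dominates.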

IV. Evidence and Tests of the Cartel Hypothesis 

The local elevator markets in the states in the Northwest did differ 
from those in other grain-producing states. Most stations had one or 
two elevators. Though there were more elevators per station in the 
Northwest, the distance between stations was longer and the railroad 
network was less dense in these states than elsewhere, so fewer alter¬ 
native sources of distribution were available to farmers in the North¬ 
west. More important, commercial line elevators accounted for a 
larger share of all elevators there than elsewhere. The economic and 
structural characteristics of elevator markets were more conducive to 
monopsony in these states than elsewhere. 3 

The problem of coordinating prices between stations is reduced but 
not completely solved by the presence of line rather than indepen¬ 
dent single-unit companies. The problem of coordinating buying 
prices among the line companies and the independents still remains. 
In a market with daily changes in terminal prices, a buying cartel 
could not succeed by simply posting a fixed buying price over long 
periods of time. A fixed buying price would be inflexible and would 
be disregarded during times of rapid change in the cash or futures 
prices. A successful cartel would require an arrangement that would 
allow for daily changes in local buying prices and yet facilitate price 
agreements among the line companies. 

The line companies issued price lists to their buyers in the local 
markets. The Grain Bulletin Card was subsidized by 18 large line 
elevator companies and supplied information about suggested buying 
prices to most of the elevators located throughout the Northwest. 4 

There was one interesting difference between the Grain Bulletin 
and the cards sent by the price-reporting services based in Kansas City 
to elevators located in other states. The Grain Bulletin did not supply 
the elevators with the latest terminal spot prices or “to arrive” prices, 
the prices for wheat delivered in 20 or 30 days. Rather, the daily card 
sent to each subscriber included a suggested buying price for wheat of 
each grade. The buying price would be derived after subtracting 
transport cost to the nearest terminal market and a “normal” margin 
for the elevator from either the “to arrive” or futures price quoted in 
the Minneapolis market. Sometimes the “to arrive” price was used, 
and in periods of rapidly changing prices, the futures price served as 
a reference price. 5 If futures markets had been banned, suggested 
buying prices would have been derived from the "to arrive” price, an 

3 Political opposition by wheat farmers to futures markets has declined over time. In 
sharp contrast to the twenties and thirties, there was no support by the West North 
Central region in 1982 for a Reagan administration proposal to tax futures (Pashigian 
1986). The development of motor transportation and farmer cooperatives reduced the 
monopsony power of the elevators. These points are discussed more fully in Pashigian 
(1987). 

4 For a discussion of the Grain Bulletin, see U.S. Federal Trade Commission (1920, 
vol. 3, chap. 8). In 1912 the Grain Bulletin was sent to all line elevators and 85 percent of 
the independent and cooperative elevators (p. 216). 

5 An illustration of the use of the “to arrive" and the futures price is presented in U.S. 
Federal Trade Commission (1920, 3:213-16). 
imperfect substitute. The Minneapolis “to arrive" market was a thin¬ 
ner market since most of the grain shipped to Duluth or Minneapolis 
was sold on consignment. Moreover, the availability of futures prices 
undoubtedly improved the functioning of the “to arrive” market. 6 

The cartel hypothesis predicts that a tax on futures would be 
favored in states in which the elevators had monopsony power. It was 
argued that the cost of collusion would be lower in states with line 
elevators. Hence, greater farmer support for a tax or a ban on futures 
should occur in states in which commercial line elevators had higher 
market shares. 

The monopsony hypothesis is tested by estimating a small four- 
equation model. The model includes three equations that explain the 
logit of the market share of line, cooperative, and independent 
elevators in a state, respectively, and a fourth equation that explains 
the logit of the proportion of a state's votes supporting the tax on 
futures. 7 The independent variables in the vote equation, which are 
of primary interest here, include a dummy variable that equals one if 
either the Chicago, Kansas City, or Minneapolis futures exchange is 
located in the state and the logit of the market share of the line 
elevators in a state. 
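The dependent variable in the vote equation is a logit, which is undefined when every vote in a state went the same way; footnote 7 describes replacing proportions of 1 and 0 with .999 and .001. A minimal sketch of that transformation follows. The function name and the `eps` parameter are hypothetical, and the estimation of the four-equation system itself is not shown.

```python
import math


def adjusted_logit(proportion, eps=0.001):
    """Logit of a proportion after moving 0 up to eps and 1 down to 1 - eps."""
    clipped = min(max(proportion, eps), 1 - eps)
    return math.log(clipped / (1 - clipped))


# A state whose senators all supported the tax, and one whose senators all opposed it.
unanimous_for = adjusted_logit(1.0)
unanimous_against = adjusted_logit(0.0)
```

Because the adjustment is symmetric, unanimous support and unanimous opposition map to logits of equal size and opposite sign.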

The seemingly unrelated regression results are presented in table 
3. The regression results for only the voting equation are presented. 
They show that political support for a tax on futures was lower if a 
major futures exchange was located in the state. Futures exchanges 
undoubtedly opposed attempts to limit futures trading. Political op¬ 
position to futures transactions increases as the market share of com¬ 
mercial line elevators increases. This result suggests that farmers ex¬ 
ercised greater political opposition in states in which the line elevators 
had greater monopsony power. 

An implication of the cartel hypothesis is that the profits of 
elevators will be higher where the local elevator(s) had monopsony 
power. The gross margin, the difference between the cash selling 
price at the terminal market and the price paid to the farmer by the 
elevator plus transport cost to the terminal market, should be higher 
for elevators located in the Northwest (or for elevators shipping grain 
to the Minneapolis terminal market). 

Prices paid and received by the elevator and shipping charges to the 
terminal market were collected by the Joint Commission, a congres- 

6 One indication of the thinness of the "to arrive” market is that the closing “to 
arrive" prices on the Minneapolis market were estimated by taking the closing futures 
price and adding a normal premium or subtracting a normal discount. 

7 In several states all the votes cast either supported or opposed the legislative pro¬ 
posals. In these cases the data were adjusted by reducing the proportion to .999 from 1 
or by raising the proportion of votes favoring a tax to .001. 
TABLE 3 

Effect of Market Share of Line Companies on Support for Futures Legislation 

Variable                                    (1)

Constant                                    4.11
                                           (2.4)
Exchange                                   -4.56
                                           (2.1)
Logit of line companies' market share       2.94
                                           (2.2)
Weighted mean square error of system        .943
Weighted R^2 of system                      .963

Note: t-statistics are in parentheses. 


sional investigative committee; are available by type of grain for 
1912-13, 1915-16, and 1920-21 seasons; and are reported for 
elevators at 69 separate elevator locations scattered throughout the 
Midwest and the Northwest (U.S. Congress 1922). The commission 
obtained the buying prices for wheat, corn, barley, and oats at the 
country elevators, the closing same-day cash price at the terminal 
market to which the grain was shipped, and freight per bushel from 
the elevator to the terminal market. A “gross margin” of the country 
elevator was calculated by subtracting the sum of the price paid to the 
farmer and freight from elevator to terminal from the cash price at 
the terminal. If the line elevators had monopsony power, the gross 
margins of the elevators located in the Northwest, where most of the 
line elevators were located, would be higher than the gross margins of 
elevators located in other midwestern states, where the independents 
and some of the cooperatives were located and where more competi¬ 
tive markets are assumed to exist. 8 
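The gross margin defined above is a simple difference. The sketch below illustrates it with hypothetical prices in cents per bushel; the function name and the numbers are assumptions for illustration and are not taken from the Joint Commission data.

```python
def gross_margin(terminal_cash_price, price_paid_to_farmer, freight_to_terminal):
    """Terminal cash price minus the sum of the farm price and freight to the terminal."""
    return terminal_cash_price - (price_paid_to_farmer + freight_to_terminal)


# Hypothetical shipment: 100 cents at the terminal, 88 cents paid to the farmer,
# and 7 cents freight per bushel.
example_margin = gross_margin(100, 88, 7)
```

Under the cartel hypothesis, margins computed this way should be larger for elevators in the Northwest than for elevators in more competitive markets.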

Two methods were used to assign elevators to three classes. Under 
the first method each elevator was classified by the location of the 
terminal to which the grain was shipped: (1) Minneapolis (MT); (2) 
Midwest (MWT), which includes Chicago, St. Louis, Kansas City, 
Omaha, and Milwaukee; and (3) other (OT), which includes New 
York, Baltimore, Galveston, San Francisco, New Orleans, and Seattle. 
The second method assigned each elevator to one of three classes by 
the location of the country elevator: (1) Northwest Territory states 
(NWE), (2) other Great Plains states (GPE), and (3) other states 


8 The cartel theory predicts higher margins for grains shipped to the Minneapolis 
terminal market or for elevators located in the states of the Northwest. Higher gross 
margins could be due to higher costs for elevators in the Northwest, e.g., lower elevator 
volume or more expensive sorting of wheat. The gross margin was not related to the 
average capacity of elevators in the state. 
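(OTE). Dummy variables were assigned to each class. For example, 
one of the regression equations is 

$$\begin{aligned}
\text{El-Marg} = b_0 &+ (b_1\,\text{MWT} + b_2\,\text{OT} + b_3\,\text{15} + b_4\,\text{20})\,\text{Wheat} \\
&+ (c_0\,\text{MT} + c_1\,\text{MWT} + c_2\,\text{OT} + c_3\,\text{15} + c_4\,\text{20})\,\text{Corn} \\
&+ (d_0\,\text{MT} + d_1\,\text{MWT} + d_2\,\text{OT} + d_3\,\text{15} + d_4\,\text{20})\,\text{Oats} \\
&+ (e_0\,\text{MT} + e_1\,\text{MWT} + e_2\,\text{OT} + e_3\,\text{15} + e_4\,\text{20})\,\text{Barley},
\end{aligned}$$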

where El-Marg represents the elevator’s gross margin. The dummy 
variables inside the parentheses represent the location of the terminal 
market or the year the grain was shipped. The suppressed dummy 
variable is wheat delivered to the Minneapolis terminal market. The 
variables are defined more fully in Appendix B. 
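The structure of the dummy variables in this equation can be sketched as a design-matrix row. The code below is only an illustration of the coding scheme described in the text: the column labels and function names are hypothetical, the 1912 season is treated as the omitted year, and wheat shipped to Minneapolis is the suppressed category absorbed by the constant.

```python
GRAINS = ["Wheat", "Corn", "Oats", "Barley"]
TERMINALS = ["MT", "MWT", "OT"]
YEARS = ["15", "20"]  # the 1912 season is the omitted year


def column_names():
    """List the regressors: a constant plus terminal and year dummies interacted with grain."""
    names = ["const"]
    for grain in GRAINS:
        for dummy in TERMINALS + YEARS:
            if grain == "Wheat" and dummy == "MT":
                continue  # suppressed category: wheat delivered to Minneapolis
            names.append(f"{dummy}*{grain}")
    return names


def design_row(grain, terminal, year):
    """Return the 0/1 regressor values for one observation."""
    active = {"const", f"{terminal}*{grain}", f"{year}*{grain}"}
    return [1 if name in active else 0 for name in column_names()]


# Wheat shipped to Minneapolis in 1912 loads only on the constant.
baseline_row = design_row("Wheat", "MT", "12")
corn_row = design_row("Corn", "MWT", "20")
```

Each coefficient on a terminal dummy for wheat is therefore read as a difference from the Minneapolis wheat margin, which is the comparison the cartel hypothesis turns on.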

Regression results are presented in table 4. Columns 1 and 2 show 
the coefficient estimates and t-statistics when all observations (wheat, 
corn, barley, and oats) are pooled and the terminal dummy variables 
are used. Columns 3 and 4 show similar results when dummy vari¬ 
ables for the location of country elevators are used. Columns 5 and 6 
show the results when the market share of the commercial line 
elevators in the state in which the elevator is located is substituted for 
the terminal or elevator dummy variables. 

The results indicate that the gross margins of country elevators that 
shipped to the Minneapolis terminal market were larger. In column 1 
the coefficients of the elevators that shipped to the Minneapolis mar¬ 
ket were larger than the coefficients of elevators that shipped to other 
markets in six of the eight comparisons, while the coefficients of the 
dummy variables for country elevators located in the Northwest were 
larger than the comparable coefficients for elevators located else¬ 
where in all five of the comparisons (col. 3). Finally, column 5 shows 
that elevators located in the states in which the line elevators had 
higher market shares had larger gross profit margins. 

An F-test was performed to determine if the difference between 
the coefficient of the Minneapolis terminal dummy variable and the 
comparable coefficient for the other terminal variables is significant. 
Similar comparison tests were performed for the coefficients of the 
elevator dummy variables. Eight pairwise tests were made when the 
terminal dummy variables were used, and five of the eight differences 
between coefficients were significant at the 5 percent probability level. 
When the elevator dummy variables were used, the results were not as 
strong, with two of the five differences significant at the 5 percent 
probability level. 9 As mentioned above, in six of the eight cases in 

9 Only four of the 69 locations involve a shipment of wheat from one of the states of 
the Northwest. The results for wheat are sensitive to one observation, the shipments of 
which the terminal variables were used, the coefficient of the Min¬ 
neapolis dummy variable was larger. When the elevator dummy vari¬ 
ables were used, the coefficient of the Northwest elevator variable was 
larger in all five cases. While these are not independent tests, they do 
suggest that margins tended to be larger when the elevator shipped 
grain to the Minneapolis market or from elevators located in the 
states of the Northwest. 

V. Conclusions 

This paper has advanced a self-interest theory of why some farmers 
opposed futures markets. For the period covered in this paper, the 
cartel theory can better explain why the political opposition was 
greater and why the elevator gross profit margins were higher in the 
states of the Northwest than in other states. The conventional expla¬ 
nation for this opposition emphasizes farmer aversion to speculation 
and gambling, but this explanation is not easily reconciled with the 
evidence. 


Appendix A 

Tax on Futures Bills and Amendments 

Caraway (Arkansas) Bill (1928). —Bill would impose a tax of 50 cents for each 
$100 on contracts on grain and cotton exchanges. Defeated by 24-50 (Con¬ 
gressional Record, 70th Cong., 1st sess., vol. 69, pt. 11, p. 8273). 

Caraway (Arkansas) Bill (1929). —Bill would prevent the sale of cotton and 
grain in futures markets unless the seller of a contract has on hand or the 
prospect of owning the actual cotton or grain. Defeated by 27-51 (Congres¬ 
sional Record, 70th Cong., 2d sess., vol. 70, pt. 6, p. 3433). 

1938. —Bill to repeal the three-cent federal tax on futures transactions. 
Passed by 52-30 (Congressional Record, 75th Cong., 3d sess., vol. 83, pt. 5, p. 
5038). 


Appendix B 

Definition of Variables 

MT      = 1 if elevator ships to Minneapolis terminal market; 
MWT     = 1 if elevator ships to midwestern terminal market; 
OT      = 1 if elevator ships to any other terminal market; 
NWE     = 1 if elevator is located in Northwest Territory states; 
GPE     = 1 if elevator is located in other Great Plains states; 
OTE     = 1 if elevator is located in any other state; 
12      = 1 if year equals 1912; 
15      = 1 if year equals 1915; 
20      = 1 if year equals 1920; 
Wheat   = 1 if elevator ships wheat; 
Corn    = 1 if elevator ships corn; 
Barley  = 1 if elevator ships barley; 
Oats    = 1 if elevator ships oats; 
Line elevator market share = market share of line elevators in state in which 
          elevator is located. 


spring wheat from Mansfield, S.Dak., to Chicago. The elevator(s) in this town reported 
very low margins. The 1927 report of the Board of Railroad Commissioners shows 
three elevators in this town, of which two were cooperative elevators. So the low mar¬ 
gins for this town may have been reported by one or both cooperative elevators and/or 
by the private elevator competing with the two cooperative elevators. 
Time and Punishment: An Intertemporal 
Model of Crime 


Michael L. Davis 

Southern Methodist University 


If an increase in the rate at which a criminal commits crimes lowers 
the expected time until detection, the income from crime (net of 
expected fines) must be discounted at a rate that varies with the 
crime rate. This paper models the criminal’s choice of the optimal 
crime rate under such conditions. It is shown that, irrespective of 
the criminal's attitude toward risk, an increase in the probability of 
detection is more likely to deter crime than a comparable increase in 
penalties. Other implications of the model for the optimal enforce¬ 
ment of laws are also explored. 


I. Introduction 

The fruits of illegal activity are no doubt made sweeter by the fact that 
they can be savored before the costs of their acquisition must be paid. 
An economics professor, for example, must obtain his credentials and 
establish a reputation before enjoying the income and prestige that 
come from his position. A bank robber, on the other hand, gets the 
money first and goes to jail later. 

If one models criminal behavior as a problem of labor supply, as I 
do in this paper, the timing of rewards and punishment should not, 
by itself, complicate the analysis since all future costs and benefits can 
be discounted to present value. It is tempting, therefore, to ignore the 
intertemporal aspects of crime and take as the starting point for any 
analysis the effect of the criminal’s behavior on his wealth. To do so, 


I wish to thank Josef Hadar and an anonymous referee for several helpful com¬ 
ments. 
however, ignores the fact that the uncertainty of punishment affects 
not only the expected income from crime but also the length of time 
in which a lawbreaker can expect to earn that stream of income. As 
such, the expected income from crime must be discounted at a higher 
rate than the rate of discount for income from legal endeavors and, 
more crucially, at a rate that will vary with the crime rate. 

Explicit consideration of this fact advances the analysis of crime on 
several fronts. First, it suggests that differing propensities to commit 
crime can be explained by the attitude of the agents toward the fu¬ 
ture. Second, it permits a more precise analysis of the effect of 
changes in penalties and the level of enforcement effort, yielding 
predictions that are in some sense different from those obtained from 
a static model. Finally, the model presented here can be extended in 
ways that static models cannot be, in order to examine a number of 
issues relevant to the enforcement of laws. 

In this paper I present a simple model of criminal behavior that 
considers the timing of punishments. I assume that the lawbreaker 
faces a two-period planning horizon: earning income from crime dur¬ 
ing the first period, paying a penalty at the start of the second period, 
and earning only legal income thereafter. The time at which the 
penalty must be paid is uncertain, with a distribution that is affected 
by the seriousness of the crimes committed. 

To place my paper in the context of the existing literature, it is 
useful to refer to Heineke’s (1978) categorization of economic models 
of crime as either (1) portfolio problems, in which the agent must 
decide how much wealth to put at risk through involvement in crime, 
or (2) labor supply problems, in which the agent must choose the 
amount of time to be allocated to illegal activity. As Heineke points 
out, examples of the portfolio approach are found in Allingham and 
Sandmo (1972), Kolm (1973), and Singh (1973). Examples of the 
labor supply problem include Becker (1968), Ehrlich (1970, 1973), 
Sjoquist (1973), and Block and Heineke (1975). This paper fits more 
comfortably in the second category since the agent is assumed to 
select the crime rate that maximizes expected wealth. 

Section II presents the basic model, a brief discussion of the major 
assumptions, and an analysis of the optimal crime rate. Section III 
examines the major implications of the model and suggests some 
possible extensions. 

II. The Model 

The wages of sin may be high or low, but to an economically mo¬ 
tivated lawbreaker what really matters is expected wealth. In order to 
measure the contribution of crime to wealth, imagine an agent contemplating 
illegal activity that, if undetected, will yield an income rate 
of U(o), where o is the rate at which offenses are committed. 1 Assume 
that some nonnegative offense rate maximizes U and that U'' < 0. To 
give some illustrations, U might represent the profit function of a firm 
subject to pollution control laws or price controls and o might repre¬ 
sent the rate of pollution or the price charged in excess of the ceiling. 

The agent views the future as being split into two segments: the 
period before the violation is detected and punished and the period 
after detection. Without loss of generality, assume that if the agent 
engages in illegal activity, then during the first period he earns no 
income from legal activity (this is analytically equivalent to measuring 
all rates of income and the fine in terms of deviations from the income 
generated by legal activity during this first period). Thus during the 
first period the income rate is simply U(o). 

After detection, all further violations are effectively prohibited and 
the agent earns income at the rate Y from some legal activity. At the 
time of detection, the lawbreaker is required to pay a fine of F. This is, 
of course, a greatly simplified vision of the impact of detection. Both 
recidivism and fines that vary with the level of offenses can be in¬ 
cluded in the model without affecting the results discussed below, 
albeit at the expense of more cumbersome notation and some addi¬ 
tional complexity. 2 

The agent does not know exactly when detection will occur but does
have an opinion about the distribution of the time of detection. Thus
he can calculate the expected present value of the fine as
$F\int_0^\infty g(t)e^{-rt}\,dt$, where g(t) is the density of the time of
detection and r is the discount rate. The probability that detection will
have occurred by some time t in the future, and hence that the
lawbreaker will be earning Y at time t, is given by the cumulative
distribution function (cdf), G(t). Over an infinite horizon, then, the
expected present value of future income from legal and illegal activity
less the expected present value of the fine is given by

$$V(o) = \int_0^\infty \{U(o)[1 - G(t)] + YG(t) - Fg(t)\}e^{-rt}\,dt. \tag{1}$$

The assumption of an infinite planning horizon is clearly
appropriate for analyzing a firm with transferable property rights. For
examining an individual engaged in criminal activity, the assumption is
questionable (e.g., an armed robber cannot usually sell the right to
rob liquor stores to an associate or give it to his children). At any rate,
a finite-horizon model, although somewhat more complex, would
yield the same basic results.

1 In expressing the magnitude of lawbreaking with a single continuous variable, I am
essentially assuming that there is only one type of crime and that it can be committed at
various rates. For example, o might be the number of illegal aliens hired per week. If
different types of crimes are possible (say, hiring illegals and paying them
subminimum wages), then it is still possible to aggregate all crimes by a single variable if
they are committed in fixed proportions.

2 In an earlier version of this paper, which is available on request, I examine several
applications not presented here, including the situation in which fines vary with the
offense rate.

In order to describe the relation between the level of the violation
and the timing of detection, consider the probability of being caught
within some small interval of time t after having evaded the
authorities up until that time (i.e., the hazard rate). This probability is
obtained from the conditional density g(t)/[1 - G(t)]. Assume that this
probability is an increasing function of both o and E, the rate of
enforcement (perhaps as measured by the budget of the authorities),
is convex in o, but is otherwise independent of time. That is, assume
that G(t) is a function such that

$$\frac{G'(t)}{1 - G(t)} = P(o, E), \tag{2}$$

with $P_o, P_E > 0$ and $P_{oo} > 0$.

Excluding time as a direct argument of P(·) is equivalent to
assuming that the chances of being caught at any moment depend only on
the offense rate at that point in time. Such an assumption is consistent
with a world in which the authorities have no institutional memory or
are faced with some other constraint that prevents them from
allocating enforcement resources on the basis of past offenses. The
assumption is also appropriate for examining crimes in which all evidence
linking the perpetrator to the offense disappears after some short 
interval (some computer crimes and the mysterious disappearance of 
Jimmy Hoffa suggest themselves as candidates for this category). 
While relaxing this assumption does not invalidate the central point 
of this paper, it does add technical complexity to the model. 

The objective of the lawbreaker is now well defined: maximize
expected wealth, (1), subject to (2). While in the most general sense this
is an exercise in optimal control, the analysis is simplified by the fact
that time is assumed not to affect the hazard rate and that the
planning horizon is infinite. Thus it is easily shown that the optimal o is
constant over time.3

When o is constant over time, (2) becomes a linear differential
equation that has as its solution $G(t) = 1 - e^{-Pt}$, because G(t) is a cdf
with a lower bound of zero, G(0) = 0. Substituting this into (1), integrating,
and rearranging terms reveals that the objective of the agent is to
select o so as to maximize

$$V(o, E) = \frac{U(o) - Y - P(o, E)F}{r + P(o, E)} + \frac{Y}{r}. \tag{3}$$

3 The model is technically identical to the limit-pricing model presented by Kamien
and Schwartz (1981, pp. 206-8), which includes an excellent discussion of the analytical
details.


When the constant term, Y/r, is ignored, equation (3) highlights the
central point of this paper. The numerator is simply the expected
income from crime, the factor on which most of the labor supply
models of crime focus (see, e.g., Becker 1968; Block, Nold, and Sidak
1981). The denominator is the rate at which this flow should be
discounted in order to obtain expected wealth. What (3) demonstrates is
that this effective discount rate is the sum of the agent’s usual time
preference plus the probability of being caught. In other words, the
unique risks associated with illegal activity affect both the rate of
expected income and the effective discount rate. Since the crime rate
determines the level of risk, a lawbreaker controls both the expected
income from crime and the rate at which income is discounted to yield
wealth.
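
As a numerical check on the step from (1) to (3), the following sketch integrates (1) directly under G(t) = 1 - e^{-Pt} and compares the result with the closed form in (3). The functional forms U(o) = A*o - o^2/2 and P(o, E) = E*o^2, the function names, and all parameter values are illustrative assumptions and do not come from the paper.

```python
import math

# Illustrative assumptions (not from the paper): functional forms and parameter values.
A, Y, F, E, R = 10.0, 2.0, 5.0, 0.01, 0.05

def income(o):
    """Assumed illegal income rate U(o) = A*o - o**2/2 (concave, maximized at o = A)."""
    return A * o - 0.5 * o * o

def hazard(o):
    """Assumed hazard rate P(o, E) = E*o**2 (increasing and convex in o for o > 0)."""
    return E * o * o

def wealth_closed_form(o):
    """Equation (3): (U - Y - P*F)/(r + P) + Y/r."""
    p = hazard(o)
    return (income(o) - Y - p * F) / (R + p) + Y / R

def wealth_by_integration(o, horizon=400.0, steps=40_000):
    """Equation (1) with G(t) = 1 - exp(-P*t), integrated by the trapezoid rule."""
    p, u = hazard(o), income(o)

    def integrand(t):
        survive = math.exp(-p * t)   # 1 - G(t): still undetected at time t
        density = p * survive        # g(t): density of the detection time
        return (u * survive + Y * (1.0 - survive) - F * density) * math.exp(-R * t)

    h = horizon / steps
    total = 0.5 * (integrand(0.0) + integrand(horizon))
    total += sum(integrand(i * h) for i in range(1, steps))
    return total * h

closed = wealth_closed_form(2.0)
numeric = wealth_by_integration(2.0)
print(f"closed form: {closed:.4f}, direct integration: {numeric:.4f}")
```

The two numbers should agree up to truncation and discretization error, which is small here because the integrand decays exponentially.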

The optimal offense rate satisfies the first-order condition

$$U_o = P_o F + P_o\left(\frac{U - Y - P \cdot F}{r + P}\right), \tag{4}$$


where the left-hand side can be thought of as the marginal benefit of
crime and the right-hand side gives the marginal cost (the
assumptions already made guarantee that $V_{oo} < 0$). What (4) makes clear is
that the marginal cost of crime obtains from two sources. The first
term on the right-hand side shows that an increase in the offense rate
raises the expected fine and hence lowers the expected income from
crime. The second term shows that an increase in the offense rate also
raises the effective discount rate and hence lowers the value of the
income stream generated by crime.
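
To make the two marginal-cost terms concrete, the sketch below solves the first-order condition numerically and reports each term at the optimum. It assumes, purely for illustration, U(o) = A*o - o^2/2 and P(o, E) = E*o^2 with made-up parameter values; none of these forms or numbers appear in the paper.

```python
# Illustrative assumptions (not from the paper): functional forms and parameter values.
A, Y, F, E, R = 10.0, 2.0, 5.0, 0.01, 0.05

def marginal_benefit(o):
    """U_o for the assumed U(o) = A*o - o**2/2."""
    return A - o

def marginal_cost(o):
    """Right-hand side of (4): P_o*F + P_o*(U - Y - P*F)/(r + P), with P = E*o**2."""
    p, p_o = E * o * o, 2.0 * E * o
    u = A * o - 0.5 * o * o
    return p_o * F + p_o * (u - Y - p * F) / (R + p)

def optimal_offense_rate(lo=0.0, hi=A, tol=1e-10):
    """Bisection on marginal benefit minus marginal cost.

    With the assumed forms the gap is positive at o = 0, negative at o = A,
    and changes sign exactly once in between.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal_benefit(mid) - marginal_cost(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

o_star = optimal_offense_rate()
fine_term = 2.0 * E * o_star * F                    # first term: higher expected fine
discount_term = marginal_cost(o_star) - fine_term   # second term: higher effective discount rate
print(f"o* = {o_star:.4f}, fine term = {fine_term:.4f}, discount term = {discount_term:.4f}")
```

Under these assumed numbers the discount-rate term is by far the larger of the two, which is the channel that a static model omits.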

In order to examine conditions under which all crimes are
deterred, let Y* be the income available to the agent who never engages
in crime, and assume that there is no chance of being falsely
prosecuted. The optimal crime rate will be zero if

$$\frac{U(o) - Y - P(o, E)F}{r + P(o, E)} < \frac{Y^* - Y}{r}, \tag{5}$$

where o is the solution to (4).

In many circumstances, at least, it seems reasonable to suppose that
criminal activity will foreclose some opportunities to earn legal
income and hence that Y* > Y. For example, a baby food manufacturer
found selling tainted products cannot expect the same degree of
product loyalty after having been fined by the Food and Drug
Administration. A young man found guilty of homicide, in some states at
least, is unlikely to be admitted to the Bar. This assumption is not
critical to the model, although, as I show below, it does generate some
interesting results.

III. Implications and Extensions 

The model presented here advances the analysis of crime in at least
three ways. First, through the usual methods of comparative statics,
the model predicts that agents with higher discount rates will be more
likely to commit crime. That is, variation in the amount of crime committed
by different agents can be explained by differing attitudes toward the
future. This contrasts with most static economic models of crime that
offer as their only explicit explanation of the variance in crime rates
differences in the costs or benefits of crime. Thus, when confronted
with evidence of two agents who, while experiencing the same
constraints (i.e., the same U(·) - Y, P(·), and F), make different choices,
the static model can explain this only as some unmeasured difference
in preferences. This model allows, at least in principle, a
measurement of these differences.
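
The comparative-statics claim can be illustrated with a small sketch that solves the first-order condition (4) for several discount rates. The functional forms U(o) = A*o - o^2/2 and P(o, E) = E*o^2 and every parameter value are assumptions chosen only for illustration; the paper does not specify them.

```python
# Illustrative assumptions (not from the paper): U(o) = A*o - o**2/2, P(o, E) = E*o**2.
A, Y, F, E = 10.0, 2.0, 5.0, 0.01

def foc_gap(o, r):
    """Marginal benefit minus marginal cost from (4) at discount rate r."""
    p, p_o = E * o * o, 2.0 * E * o
    u, u_o = A * o - 0.5 * o * o, A - o
    return u_o - p_o * F - p_o * (u - Y - p * F) / (r + p)

def optimal_offense_rate(r, lo=0.0, hi=A, tol=1e-10):
    """Bisection on the first-order condition; the gap changes sign once on [0, A] here."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc_gap(mid, r) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

discount_rates = [0.02, 0.05, 0.10, 0.20]
offense_rates = [optimal_offense_rate(r) for r in discount_rates]
for r, o in zip(discount_rates, offense_rates):
    print(f"r = {r:.2f}  ->  optimal offense rate = {o:.3f}")
```

With these assumed forms, agents facing identical income, hazard, and fine schedules choose higher offense rates the more heavily they discount the future.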

This is, of course, not to say that including the discount rate in 
empirical studies of crime will always be simple. When one is dealing 
with crimes committed by individuals, especially for nonpecuniary 
gain, it may be rather difficult to measure the individual’s rate of time 
preference. However, when one is analyzing economic crimes,
especially those committed by firms, adequate proxies for the true
discount rate will almost always be available.

The second insight made available by this formulation deals with
the relative merits of fines and enforcement effort in deterring crime.
There has apparently been much speculation in both the popular and
professional literature that criminals are more deterred by swift and
certain punishment than by heavy penalties that are loosely enforced.
Becker (1968, p. 178) suggested that if this is true, it implies that
criminals are risk preferrers. While the question needs to be asked
with a great deal of care, it appears as if such behavior may be
exhibited by a much broader class of criminals than those who prefer risk.
To see this, perform Becker’s suggested experiment and let an
increase in the probability of detection be "compensated" by an equal
percentage reduction in the fine so as not to change the expected
income from crime. From (5) it is clear that even though the expected
income from crime has not changed, the value of criminal activity has
been reduced since, as P goes up, this stream is discounted at a higher
effective rate. At some point the agent will find it more profitable to
eschew all crime.4
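
A minimal sketch of this compensated experiment follows. It holds the offense rate fixed (a simplification), multiplies the hazard rate by a scale factor, divides the fine by the same factor, and compares the left-hand side of (5) with the right-hand side. All numbers, including the legal income Y* of a non-offender, are hypothetical.

```python
# Hypothetical values for illustration only; none are taken from the paper.
U_RATE, Y, Y_STAR, R = 18.0, 2.0, 3.0, 0.05   # income rates and discount rate
P0, F0 = 0.04, 5.0                             # baseline hazard rate and fine

def crime_value(scale):
    """Left-hand side of (5) when P is multiplied by `scale` and F is divided by it."""
    p, fine = P0 * scale, F0 / scale
    expected_income = U_RATE - Y - p * fine    # unchanged by construction
    return expected_income / (R + p)

legal_premium = (Y_STAR - Y) / R               # right-hand side of (5)
scales = [1, 2, 5, 10, 20, 50]
values = [crime_value(s) for s in scales]
deterred = [v < legal_premium for v in values]
for s, v, d in zip(scales, values, deterred):
    print(f"scale = {s:>2}: value of crime = {v:7.2f}, deterred = {d}")
```

Expected income from crime is the same in every row, yet the value of crime falls as detection becomes more certain, and under these numbers it eventually drops below the legal alternative.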

A third advantage of a model that explicitly considers the timing of 
rewards and punishments is that it can be easily extended to consider 
a range of issues best accounted for in an intertemporal context. 
There are many such examples. The impact of judicial delay can be 
assessed through a straightforward modification of the model
discussed above. Although I have described an agent with a two-part
planning horizon, recidivism can be analyzed by using the same basic
structure. Finally, it is possible to analyze the impact of programs,
such as prisoner schooling and counseling, that raise the offender’s
income after punishment.5

Although it is certainly possible to object to specific assumptions
that I have made in this model, I can see no advantages that obtain
from ignoring the timing of rewards and punishments. All the
predictions obtained from the static model are available here. With this
formulation it is possible to evaluate the impact of fines, enforcement
effort,6 and opportunity costs on the level of crime. It is also possible
to incorporate the lawbreaker’s reactions into the enforcement
authority’s utility function in order to analyze the government’s
behavior. Finally, this model, like the static formulation, can serve as the
basis for a normative analysis of crime (e.g., to determine the optimal
level of crime).


IV. Summary 

The timing of rewards and punishments is an important factor in the 
economic model of crime that has heretofore been ignored. In many 
instances the benefits of crime are enjoyed for some time before the 
punishment is received. The purpose of this paper has been to
present and analyze a model in which punishments follow crime at some
uncertain time in the future. It is shown that this requires the income 


4 This is not to say that a compensated increase in the probability of detection will also
reduce the rate of offenses committed by those who are not completely deterred. The
response of this group will depend on how the marginal probability, $P_o$, is changed.

5 Not surprisingly, it can be shown that while such programs lower recidivism by
raising the opportunity cost of repeat offenses, they raise the level of first offenses. A
model similar to the one described here allows a precise consideration of the trade-offs
involved.

6 Interestingly, it can be shown that as E goes up, an individual may commit more
crimes. This contrasts with the static model, which predicts that enforcement effort
always reduces the crime rate. The result, which holds even when fines are assumed to
vary with the offense rate, seems to be consistent with Samuel Johnson's maxim “when
a man knows he is to be hanged in a fortnight it concentrates his mind wonderfully.”
That is, when an offender knows that prosecution is certain, he tries to maximize
income before the inevitable day of reckoning.




from crime to be discounted at a higher rate than the income from 
legitimate activity. Explicit consideration of timing alters and extends 
the usual economic model of crime in several regards. Among other 
things, it can be formally demonstrated that the propensity of an 
individual to break the law will depend on his attitude toward the 
future. 
Accounting for Changes in Tastes

Health concerns are thought by many to have shifted consumption
away from red meats, though econometric evidence is mixed.
Testing for structural change is difficult, especially when one time series
is used for both estimating demand equations and testing their
stability. Specification errors may suggest a shift where none has
occurred. Using nonparametric demand analysis, we find that meat
consumption patterns in the United States and Australia can be
explained using only relative prices and expenditures. Only
imposing particular functional forms can reverse the conclusion,
suggesting that specification errors in econometric demand studies can
account for findings of taste changes.


It is a fallacy, of course, to say that there’s no accounting
for taste. [Pierre Franey, The New York Times 60-Minute
Gourmet]


I. Introduction 

A common procedure in empirical demand analysis is to estimate a
system of demand equations defined only over market prices and
quantities and then to examine the compatibility of the results with
utility maximization. In many cases, prices and expenditure seem
inadequate in explaining observed patterns of consumption. Quite
often, the rejection of symmetry or homogeneity restrictions, the
presence of apparent dynamic influences on consumption, or
significant unexplained trends in the data lead to the conclusion that
“tastes” have changed over time.

Varian for his program for nonparametric demand analysis and to Laura Blanciforti,
Will Martin, Darrell Porter, and Mike Wohlgenant for the use of their data sets. Gian-

The arguments of Stigler and Becker (1977) suggest a different 
interpretation of results of this type, and a different response. They 
proposed that it is more useful, and valid, to treat individual
preferences as constant and to seek an economic explanation for any
observed changes in demands for market goods. At the level of the
individual, such changes may arise from changes in the shadow prices 
of household resources or changes in the household technology used 
to transform market goods into fundamental goods. Alternatively, 
even when individual demands are constant, per capita demands at, 
say, a national level could change as a consequence of demographic 
changes or changes in the distribution of income. 

With the Stigler and Becker interpretation of demand theory, a 
finding of structural change in a model of per capita demands for 
market goods should not be surprising. It could result, for instance, 
from inappropriate aggregation of individual data, the exclusion of 
relevant variables such as the opportunity cost of time, or some other 
specification error. 

As well as having implications for the validity of econometric work, 
shifts in market demands are of direct interest to industry and
policymakers. The current concern over declining consumption of red
meats provides an excellent example. There is at present a great 
emphasis on altering the nature of products and promotions to offset 
the effects of taste change in the meat industry. That it remains to be 
established whether the demand for red meat has, in fact, changed 
provided the initial stimulus for this study. 

This paper focuses on a nonparametric method for testing whether 
market demands have shifted because of a change in tastes. The 
advantage of the approach is that it gives a test for stable preferences 
for market goods that does not require that they be of a particular 
form, such as the translog. We apply the technique to the demand for 
red meats using time-series data from the United States and
Australia, testing for whether market prices and expenditures provide a
complete explanation of consumption patterns. 

The paper proceeds as follows. Section II contrasts two approaches
to testing for structural change in empirical demand studies. Section
III discusses the case of the demand for red meats, and Section IV
addresses some caveats related to necessary aggregation and
separability assumptions. The nonparametric method is described in Section
V, and Sections VI and VII describe results and some of the issues
surrounding tests for stable preferences. A summary of results,
Section VIII, concludes the paper.

II. Parametric and Nonparametric Approaches to 
Structural Change 

Searches for structural change in commodity demands can take two 
forms. Each one amounts to a check for the compatibility of data with 
some stable set of well-behaved preferences. The first approach is the 
more familiar one. A functional form is chosen for demand curves 
and the parameters are estimated. Tests for the stability of parameter 
estimates, statistical significance of trends, autocorrelation of
residuals, or other diagnostics are then used to detect structural changes in
the estimated system.1 All such tests are conditional on the functional
form’s being correct; it is uncommon to see several alternatives
examined, and it would not be desirable to attempt such a specification
search without some systematic approach. 

A definitive test would involve only the hypothesis that preferences 
are stable, not that they are stable and of a particular form. Such a test 
would produce either a yes or no answer to the question, “Is it
possible to explain this data set with some demand system?” A
nonparametric alternative to the traditional approach makes this possible
while avoiding the problem of having to estimate all possible demand
systems. It uses the results of revealed preference analysis, established
by Samuelson (1938) and Houthakker (1950) and more recently
advanced in papers by Afriat (1967) and Varian (1982, 1983).

As in the parametric approach, the null hypothesis is that there is a 
stable set of preferences so that variation in observed quantities
consumed can be explained by changes in relative prices or expenditures.
When consumers obey the strong axiom of revealed preference, there 
is a stable demand system that fully explains observed consumption 
patterns. This holds because the strong axiom is equivalent to the 
existence of a well-behaved utility function. The axiom need not hold 
in the data when structural changes occur, so a test for violations of 
the axiom is capable of identifying changes in preferences. In
addition, the approach avoids the specification bias likely with arbitrarily
selected functional forms or the pretesting inherent in specification 
searches. 
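
One way to sketch a strong-axiom check in code is shown below. It builds the direct revealed-preference relation (bundle b was affordable when a was chosen), takes its transitive closure, and reports pairs of distinct bundles that are each revealed preferred to the other. The function names and the two-good data are hypothetical, and this is only one textbook formulation of the test, not the program used in the study.

```python
def dot(u, v):
    """Inner product of a price vector and a quantity vector."""
    return sum(x * y for x, y in zip(u, v))

def sarp_violations(prices, bundles):
    """Return pairs (a, b) of distinct bundles that are each revealed preferred to the other."""
    n = len(bundles)
    # Direct revealed preference: bundle b was affordable when bundle a was chosen.
    pref = [[dot(prices[a], bundles[a]) >= dot(prices[a], bundles[b]) for b in range(n)]
            for a in range(n)]
    # Warshall's algorithm: transitive closure of the direct relation.
    for k in range(n):
        for a in range(n):
            for b in range(n):
                pref[a][b] = pref[a][b] or (pref[a][k] and pref[k][b])
    return [(a, b) for a in range(n) for b in range(a + 1, n)
            if bundles[a] != bundles[b] and pref[a][b] and pref[b][a]]

# Hypothetical two-good data: the same prices with two different sets of choices.
prices = [(1, 2), (2, 1)]
stable = [(4, 1), (1, 4)]    # each chosen bundle is unaffordable at the other period's prices
shifted = [(1, 4), (4, 1)]   # each period's choice costs less at the other period's prices

print(sarp_violations(prices, stable))    # []
print(sarp_violations(prices, shifted))   # [(0, 1)]
```

An empty list means the observations can be rationalized by one stable preference ordering; any reported pair is evidence against the null hypothesis.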

1 This explains the use of the term “parametric” for this approach. For an analogous
approach to the measurement of technical change, see Berndt and Khaled (1979).

Previous applications of the nonparametric technique include
studies by Landsburg (1981), Varian (1982, 1983, 1984, 1985), Hansen
and Sienknecht (1985), Swofford and Whitney (1986), and Thurman
(1987).2 Two of these applications have been concerned with testing
for stability of preferences over time. In his analysis of the demand
for meat in the United States, Thurman (1987) found that 25 years of
annual data were consistent with revealed preference theory and
stable preferences. He noted that there has been steady growth in real
expenditures and suggested that the method might have low power
when used as a test for stable preferences. This observation was also
made by Landsburg (1981) for several data sets from the United
Kingdom and by Varian (1982). When budget lines shift steadily
outward over time so that they rarely cross, there will be little chance of
finding observations inconsistent with the axioms. Each year's
consumption bundle is revealed to be preferred to all previous ones.
Thurman argued that the finding was therefore necessary, but not
sufficient, to conclude that preferences had remained stable.
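
The low-power point can be illustrated with a small sketch. Here a bundle a is treated as revealed preferred to b when spending in period a exceeds the cost of b at period-a prices, and a violation is a pair in which each bundle is revealed preferred to the other. The data are hypothetical: constant prices with steadily growing expenditure and deliberately erratic choices, followed by a two-period case in which the budget lines cross.

```python
def dot(u, v):
    """Inner product of a price vector and a quantity vector."""
    return sum(x * y for x, y in zip(u, v))

def warp_violations(prices, bundles):
    """Pairs (a, b) in which each bundle is revealed preferred to the other."""
    n = len(bundles)
    spend = [dot(prices[t], bundles[t]) for t in range(n)]
    return [(a, b) for a in range(n) for b in range(a + 1, n)
            if spend[a] > dot(prices[a], bundles[b])
            and spend[b] > dot(prices[b], bundles[a])]

# Nested budget lines: prices fixed, expenditure rising 10, 11, 12, 13, 14.
steady_prices = [(1, 1)] * 5
erratic_bundles = [(10, 0), (0, 11), (12, 0), (1, 12), (14, 0)]

# Crossing budget lines: relative prices reverse between the two periods.
crossing_prices = [(1, 2), (2, 1)]
crossing_bundles = [(1, 4), (4, 1)]

print(warp_violations(steady_prices, erratic_bundles))     # []
print(warp_violations(crossing_prices, crossing_bundles))  # [(0, 1)]
```

In the first data set no choice pattern, however erratic, could produce a violation, because earlier budgets never reach later bundles. Only when budget lines cross does the test have any chance of rejecting.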

In the sections that follow, we test for the stability of per capita 
demands for meat in both the United States and Australia using the 
nonparametric approach, and we examine the power of the test for 
stability under alternative assumptions. In the next section, we discuss 
the background for the study and review some of the necessary
caveats when aggregated data are used in demand analysis.

III. The Context of the Study 

Agricultural economists and other observers in both countries have
paid increasing attention in recent years to the question of structural
change in the demand for meat products, especially beef.3 Per capita

2 Varian (1982, 1983) discussed the application to consumer expenditure data and
later extended the methods to tests for cost minimization by producers (Varian 1984)
and to formal statistical testing for violations of the axioms (Varian 1985). Diewert and
Parkan (1985) showed how tests for the separability of a subgroup of commodities
could be performed using the nonparametric approach. Hansen and Sienknecht (1985)
used the nonparametric approach as a preliminary test of expenditure data, which they
used to compare a variety of demand systems. Finding that the data were consistent
with revealed preference ensured that it was possible to fit demand systems to the data
and that poor performance of a particular form would not be attributable to having
used a data set inconsistent with demand theory. Finally, Swofford and Whitney (1986)
checked for the consistency of observed quantities and user costs of several monetary
assets to verify that a demand system with restrictions from utility theory could be used
to model the demand for money and money substitutes.

3 Studies on several aspects of the demand for meats in Australia and the United
States include Christensen and Manser (1976), Main, Reynolds, and White (1976),
Richardson (1976), Fisher (1979), Pope, Green, and Eales (1980), Haidacher et al.
(1982), Nyankori and Miller (1982), Braschler (1983), Chavas (1983), Haidacher
(1983), Moschini and Meilke (1984), Murray (1984), Martin and Porter (1985),
Menkhaus, St. Clair, and Hallingbye (1985), Wohlgenant (1985), Alston and Chalfant (1987),
Chalfant (1987), and Thurman (1987).
consumption of red meat has declined in recent years, and it is widely
suggested in industry circles that this reflects a shift of consumer
preferences due to dietary concerns. The notion that preferences
have shifted dramatically has been used to justify attempts to shift
them back through promotion, the development of new grading
systems, and other product innovations.

Although these investments may well have a positive rate of return, 
independent of whether any taste change has occurred, finding
stability of preferences would suggest, as Wohlgenant (1985) concluded,
that production research might remain a better investment than
attempting to shift consumer preferences that are in fact rather stable.
In any event, a necessary first step in evaluating the rate of return to 
such activities is accounting for the various influences behind shifting 
consumption patterns. We used time-series data sets of per capita 
consumption and prices to examine the stability of preferences 
among meats in the United States and Australia. Given the paucity of 
evidence of a longitudinal nature, this is the typical sort of data set 
used in testing for structural changes. 

To study Australian meat demand, we used Martin and Porter’s 
(1985) data set for five meats: beef, chicken, lamb, mutton, and pork. 
These data include quarterly domestic disappearance for the five 
meats during the period 1962:1 to 1984:4. The data were not
seasonally adjusted. It seems to be a harmless assumption to ignore most
other commodities in the analysis, but omitting fish consumption data
may bias results if fish substitutes for the included goods.
Unfortunately, for Australia we do not have a consistent series for fish
consumption or a fish price that can be incorporated into the analysis. We
also used two data sets for the United States that have been studied
previously. The annual data set (1947-78) reported by Blanciforti
(1982) was used to examine the demand for beef, chicken, pork, and
fish, and a similar data set that extends to 1983, from Wohlgenant
(1985), was also tried.4

There are some interesting differences in consumption patterns 
and in the data between the two countries. Lamb and mutton are 


4 As always when time-series data are used in food demand analyses, the available
quantity data are not ideal. Testing for changes in tastes requires measures of quantities
actually consumed and corresponding prices. Prices were available at the retail level for
both countries. In both of the U.S. data sets, quantities were taken from the Food
Consumption, Prices, and Expenditures reports issued by the Department of Agriculture.
In those, retail quantities were estimated from wholesale data; no actual consumption
data were available (Bunch 1987). This introduces possible measurement errors, to the
extent that disappearance and consumption differ substantially. For Australia, only
wholesale quantity data were available. If the proportion of wholesale to retail weights
is the same for all meats through time, this is not a problem, only a redefinition of units
of measurement. To the extent that dressing percentages do vary, potential bias is
introduced into empirical analysis.
important in Australia but are relatively unimportant in U.S.
consumption. In both countries, per capita chicken consumption has
risen considerably over time, while consumption of red meats has
declined. In Australia, consumption of beef, mutton, and lamb has
declined, while the same is true of beef in the United States.

These trends in per capita quantities consumed are not inconsistent 
with relative price patterns. Increased consumption of chicken and 
reduced consumption of beef in both Australia and the United States 
may be explained by the fact that in both countries real chicken prices 
have fallen and real beef prices have risen. On the other hand, it is 
difficult to account for the broad changes in consumption patterns 
with increased health consciousness alone. Any widespread move by 
consumers away from red meats would be expected to affect pork in 
the same way in both countries, yet we observe pork consumption 
rising in Australia while it is relatively constant in the United States. 
Pork prices have diverged somewhat in the two countries, and that 
could account for the divergent movements of pork consumption. 

On the whole, then, the behavior of meat consumption patterns is 
not grossly inconsistent with stable market demands, given the
directions of movements of relative prices and incomes. Still, changes in
preferences may have contributed to the observed changes in
patterns of meat consumption, and many people have taken the trends in
the data as evidence of structural change, primarily, a substitution of 
chicken for beef in both countries beyond what would have resulted 
from changes in prices and expenditures alone. 


IV. Theoretical Restrictions Applied to the 
Demand Models 

Considerable attention has been paid in the literature to whether it is 
justified to expect theoretical propositions that relate to the individual 
to hold for an aggregate or representative consumer, especially over a 
subset of goods. It is quite possible to reject the theoretical restrictions 
on the demand system solely because of biases induced by
aggregation or by erroneous separability assumptions. That possibility would
presumably carry over to tests of stable preferences. Even when
individual preferences are stable, per capita demands could appear
unstable because of aggregation bias or as a result of excluding a relevant
good. 

In practice, the analysis of aggregated data for consumption of a 
subset of goods can proceed in two ways. One approach is to abandon 
contact with the underlying theory, not invoking the restrictions of 
homogeneity, symmetry, and so on. A belief in aggregation or
separability bias would justify this but would leave absolutely no guidance as
to the nature of the demand curve that should be estimated.

The second approach, the preferred one, if its prevalence in the
literature is any guide, is to model the data as having been generated
by a representative consumer maximizing a stable utility function 
over the goods under study, subject to a constraint on expenditures. 
Separability assumptions are dictated by the difficulties in modeling a 
large number of goods, even if data limitations do not preclude a 
broader study. 

In this study, we chose to require the existence of a representative 
consumer as part of our test for stability of preferences. In other 
words, we tested the stability of demands that are consistent with a set 
of underlying preferences, as opposed to ad hoc demand equations 
that do not satisfy the restrictions from consumer theory.
Furthermore, we adopted the assumption that meats constitute a weakly
separable group in each country. Rejection of the stable preferences
hypothesis may be due to aggregation bias rather than shifts in
individual consumption patterns or to omitted goods; however, if
instability is not found, it seems safe to conclude that these are not significant
problems. 

An additional concern relates to the nature of the alternative
hypothesis in structural change studies. Whether a parametric or
nonparametric approach is used, there is often a lack of data about the
nature of the structural change, and the alternative hypothesis is
usually no more specific than that the null hypothesis is incorrect. It
would be preferable to incorporate directly the determinants and the
nature of any hypothesized structural change in a more specific
alternative hypothesis. For example, proxies for increased health
consciousness could be included. However, problems arise because of the
number of possible influences, the lack of adequate data, and the 
uncertainty about the manner in which they might affect demands. 

V. Nonparametric Analysis of Structural Change
in the Demand for Red Meats

To test the stability of the demand for red meats, we ask whether 
relative prices and expenditures on meat can provide a complete 
explanation of meat consumption patterns. The null hypothesis in 
these tests is that observed data conform to the restrictions implied by 
a stable set of well-behaved preferences, under the necessary
additional restrictions that meat constitutes a weakly separable group and
that it is appropriate to analyze per capita consumption data as having
been generated by maximization of a utility function by a
representative consumer. In other words, all that is necessary for the null
hypothesis to be correct is that we be able to plot the data and superimpose
on our plots a set of indifference surfaces and isocost planes consistent with
the rational purchase of those quantities by a consumer facing
observed relative prices.

A time series of prices and quantities can therefore be checked for 
consistency with this hypothesis using revealed preference axioms. 
We examine the data using both the weak axiom and the strong 
axiom. The weak axiom is a convenient means to illustrate the 
method, but the strong axiom is required to verify that the data are 
consistent with utility maximization. 

According to the weak axiom of revealed preference, a bundle a is 
revealed preferred to any other bundle b (denoted aRb) that could 
have been purchased instead (i.e., a is preferred to all points within 
the budget line that applies when a is purchased). The weak axiom is 
violated if any such bundle b is also revealed preferred to bundle a 
(i.e., a lies inside the budget line that applies when b is purchased). 
Such a result implies that both aRb and bRa, which could occur only if 
indifference curves had shifted, given our maintained hypotheses. 

For each data point a, let $P_a$ be the price vector and $Q_a$ the quantity 
vector, each with length equal to N, the number of goods. 5 The cost of 
purchasing bundle a is then $P_a Q_a$. A time series including prices and 
quantities of N goods can be examined for consistency with the weak 
axiom by forming a matrix $\Phi$ with typical element $\Phi_{ab} = P_a Q_b$, so that 
each element $\Phi_{ab}$ gives the cost at time a prices of purchasing the 
bundle of goods consumed at time b, as would enter the calculation of 
a cost-of-living index. For instance, the elements in each column give 
the cost at various price vectors of obtaining the consumption bundle 
b, while the elements in any particular row allow comparisons of 
the cost of the various bundles at a fixed set of prices. 

If actual expenditures at time a exceed the cost of bundle b at time a 
prices, so that $\Phi_{aa} > \Phi_{ab}$, then aRb. Violation of the weak axiom occurs 
if it is also true that bundle a was affordable at time b, so that $\Phi_{bb} > \Phi_{ba}$ 
and bRa. When well-behaved, weakly separable per capita demands 
are kept as a maintained hypothesis, any such violation of the axioms 
of revealed preference must be interpreted as evidence of a change in 
preferences between time a and time b. 
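
The weak axiom check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the results reported here: the function names cost_matrix and warp_violations are hypothetical, and the two-good, three-period data are invented so that exactly one violation is visible.

```python
from typing import List, Tuple


def cost_matrix(prices: List[List[float]],
                quantities: List[List[float]]) -> List[List[float]]:
    """phi[a][b] = cost, at period-a prices, of the bundle consumed in period b."""
    return [[sum(p * q for p, q in zip(p_a, q_b)) for q_b in quantities]
            for p_a in prices]


def warp_violations(phi: List[List[float]]) -> List[Tuple[int, int]]:
    """Pairs (a, b) with a < b for which both aRb and bRa hold, i.e. each
    bundle cost less than actual expenditure when the other was chosen."""
    n = len(phi)
    return [(a, b)
            for a in range(n) for b in range(a + 1, n)
            if phi[a][a] > phi[a][b] and phi[b][b] > phi[b][a]]


# Hypothetical data: rows are periods, columns are goods.
prices = [[1.0, 2.0], [2.0, 1.0], [1.0, 1.0]]
quantities = [[1.0, 3.0], [3.0, 1.0], [2.0, 2.0]]

phi = cost_matrix(prices, quantities)
violations = warp_violations(phi)
print(violations)
```

In this invented example the consumer buys more of each good when it is relatively expensive, so observations 0 and 1 reveal each other preferred; an empty list would indicate consistency with the weak axiom.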

The absence of any violations is consistent with stable preferences. 
It is worth noting as well that this finding would confirm that it is 
permissible to impose symmetry and homogeneity on the demand 


5 For convenience, we use a single letter of the alphabet when the meaning is clear to 
denote both a particular observation and the bundle of goods consumed at that point, 
so that aRb and $Q_a R Q_b$ are equivalent. 

system used to explain these data; that is, those restrictions are not 
inconsistent with the data. 6 

Finding that the data are consistent with the weak axiom does not 
rule out the problem of intransitivity. It is also necessary to check for 
consistency with the strong axiom. This involves a search for in¬ 
transitivity in the data, to see if bundles a, b, and c can be found that 
together imply that aRb, bRc, and cRa. The number of bundles of 
goods that can come between a and c is limited only by the size of the 
data set. The data are consistent with the strong axiom if no 
intransitivities are found in the matrix $\Phi$. 
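
The search for intransitivities can be sketched as a transitive closure of the direct revealed preference relation. The Python fragment below is a minimal illustration under stated assumptions: it uses the strict inequality given in the text for direct revealed preference, applies Warshall's algorithm (a standard technique, not necessarily the one in the programs used for this paper), and works on a hypothetical 3 x 3 cost matrix. All function names are illustrative.

```python
from typing import List


def direct_relation(phi: List[List[float]]) -> List[List[bool]]:
    """rel[a][b] is True when aRb: bundle b cost less than actual
    expenditure at period-a prices."""
    n = len(phi)
    return [[a != b and phi[a][a] > phi[a][b] for b in range(n)]
            for a in range(n)]


def transitive_closure(rel: List[List[bool]]) -> List[List[bool]]:
    """Warshall's algorithm: closure[a][b] is True when some chain
    aRx, xRy, ..., zRb exists."""
    n = len(rel)
    closure = [row[:] for row in rel]
    for k in range(n):
        for a in range(n):
            if closure[a][k]:
                for b in range(n):
                    if closure[k][b]:
                        closure[a][b] = True
    return closure


def strong_axiom_violations(phi: List[List[float]]) -> List[int]:
    """Observations that lie on a revealed preference cycle."""
    closure = transitive_closure(direct_relation(phi))
    return [a for a in range(len(phi)) if closure[a][a]]


# Hypothetical cost matrix (it arises, for example, when each period's
# bundle is one unit of a different good and row a lists period-a prices).
phi = [[10.0, 9.0, 11.0],
       [11.0, 10.0, 9.0],
       [9.0, 11.0, 10.0]]

rel = direct_relation(phi)
pairwise = [(a, b) for a in range(3) for b in range(a + 1, 3)
            if rel[a][b] and rel[b][a]]
cycle_members = strong_axiom_violations(phi)
print(pairwise, cycle_members)
```

The example is built so that no pair of observations violates the weak axiom, yet the three together form a cycle, which is the case the strong axiom is meant to catch.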

When no violations are found, it is possible to “rationalize” the data, 
to use Varian’s (1982) term. The data set can then be said to have 
been generated by the maximization of a utility function by a repre¬ 
sentative consumer. In the event that some of the observations are 
inconsistent with the axioms, it is possible to test whether the devia¬ 
tions from utility maximization are significant ones. Diewert and Par- 
kan (1985), Epstein and Yatchew (1985), and Varian (1985) each 
discussed ways to test for the statistical significance of such violations. 

It is possible to think of each observed quantity consumed as being 
made up of two components, the true quantity and a measurement 
error. It is the true quantities that should be tested for consistency 
with utility maximization, not those measured with some error. The 
question then becomes whether there can be constructed a set of 
“small” measurement errors, to be subtracted from the observed 
quantities in such a way as to satisfy the strong axiom. Varian shows 
how comparison of the variance implied by the constructed measure¬ 
ment errors to a hypothesized value permits significance levels to be 
assigned to the deviations from the strong axiom. 7 


VI. Results Using U.S. and Australian Data 

Using the Australian data set, we applied the weak criterion to the 
quarterly data used by Martin and Porter (1985) for the period 


6 Whether that will be true with a particular parametric demand system is another 
question, of course. Two papers that examine the validity of those restrictions in a 
parametric framework are Christensen, Jorgenson, and Lau (1975) using the translog 
and Deaton and Muellbauer (1980) using the almost-ideal demand system. A series of 
papers beginning with Laitinen (1978) have explored the statistical reasons for the 
tendency to reject these properties in practice. 

7 As pointed out by a referee, one could also treat prices as random, or both prices 
and quantities. The nonlinear programming problem solved to find the nearest data set 
satisfying the strong axiom is altered by changing the source of measurement errors, 
but the interpretation is otherwise identical. 

1967:1-1984:4. 8 We constructed the matrix $\Phi$ described in the previous 
section, of dimension 72 x 72, so that the first row gives the costs 
of buying the 72 different bundles at 1967:1 prices, the second row 
the costs of the same bundles at 1967:2 prices, and so on. 

It is easy to characterize $\Phi$ under the null hypothesis: no violations 
of the strong axiom will be found. All observed choices are consistent 
with maximization of the same utility function. 

Suppose that the alternative hypothesis is true, and there has been a 
substitution of chicken for beef due to health concerns or other factors 
not related to prices and expenditures. The hypothesis of stable 
preferences could be rejected if consumption patterns evolved along 
the following lines. First, a bundle (or bundles) purchased early in the 
sample was affordable later in the sample but was rejected in favor of 
a bundle with more chicken and less beef. This implies a preference 
for the bundle with more chicken. If this latter bundle was affordable 
earlier but was not chosen, we would conclude that, over time, preferences 
had shifted to chicken and away from beef. That is the only way 
to reconcile these events with rational choice and is also the only way 
to reject the stable preferences hypothesis using the nonparametric 
test; such a reversal of choices must be found. 

We found no such switch of preferences. The data are almost en¬ 
tirely consistent with the hypothesis that relative prices and the level 
of expenditures completely explain shifts in consumption patterns. 
We found only two minor violations of the weak axiom of revealed 
preference, as shown in table 1. 

In practice, it is convenient to form ratios of elements in $\Phi$, which 
turn out to be quantity indexes. We computed these by dividing every 
element in $\Phi$ by the diagonal element in the same row, forming the 
ratios $\Phi_{ij}/\Phi_{ii}$ for all i, j combinations. These indicate whether the bundle 
of time j quantities was affordable at time i prices. Thus a ratio less 
than unity indicates that iRj. The size of the index can be used to 
judge the importance of any violation that occurs when jRi also oc¬ 
curs. 
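
The ratio calculation can be illustrated with a short Python sketch. The function names are hypothetical, and the 2 x 2 cost matrix is invented: its diagonal is normalized to 100,000 so that the off-diagonal ratios equal the 0.99993 and 0.99924 values discussed for table 1. It is meant only to show how the size of a violation is read from the ratios.

```python
from typing import List, Tuple


def quantity_indexes(phi: List[List[float]]) -> List[List[float]]:
    """Divide each element by the diagonal element of its row, so that
    entry [i][j] is the cost of bundle j relative to actual expenditure
    at period-i prices."""
    n = len(phi)
    return [[phi[i][j] / phi[i][i] for j in range(n)] for i in range(n)]


def violation_sizes(phi: List[List[float]]) -> List[Tuple[int, int, float, float]]:
    """For every pair with both ratios below one (iRj and jRi), report the
    percentage by which each bundle was cheaper than what was bought."""
    r = quantity_indexes(phi)
    n = len(r)
    sizes = []
    for i in range(n):
        for j in range(i + 1, n):
            if r[i][j] < 1.0 and r[j][i] < 1.0:
                sizes.append((i, j,
                              100.0 * (1.0 - r[i][j]),
                              100.0 * (1.0 - r[j][i])))
    return sizes


# Hypothetical cost matrix reproducing the two reported ratios.
phi = [[100000.0, 99993.0],
       [99924.0, 100000.0]]

sizes = violation_sizes(phi)
for i, j, gap_ij, gap_ji in sizes:
    print(f"pair ({i}, {j}): {gap_ij:.3f} and {gap_ji:.3f} percent cheaper")
```

Reporting the gaps in percent makes it easy to compare a violation with the precision of the underlying quantity data.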

In table 1, these ratios are shown for the two pairs of data points 
found in violation of the weak axiom. These violations occur with the 
pair formed by time i = 1971:3 and time j = 1972:2 and with the one 
formed by time i = 1973:2 and time j = 1982:4. At time 1971:3 
prices and expenditures, the bundle consumed in 1972:2 was afford¬ 
able, though just barely. The ratio of 0.99993 indicates that the 
latter period’s bundle was just 0.007 percent cheaper than what was 


8 For the first 5 years of Martin and Porter's data (1962:1-1966:4), quarterly values 
for chicken consumption were imputed from annual quantities, so we deleted those 
data points (see Fisher 1979). 

actually consumed (and revealed preferred). When the cost of the 
1971:3 bundle is computed at 1972:2 prices and compared with that 
period’s expenditures, the earlier bundle is revealed inferior, costing 
almost 0.08 percent less. This is shown by the ratio $\Phi_{ji}/\Phi_{jj}$ = 0.99924. 
Similarly, the comparisons for time i = 1973:2 and time j = 1982:4 
show that the cost of these bundles was virtually the same at the prices 
that were observed in either period. These comparisons show viola¬ 
tions of the weak axiom, but negligible ones. Independent of the 
statistical significance of any of these violations, the deviations from 
one do not occur in significant digits, given the precision of measure¬ 
ment of quantities consumed. 

Moreover, we found that with minor adjustments in the values for 
mutton consumption in two periods, 1971:3 and 1973:2, these viola¬ 
tions of the weak axiom are no longer present. We selected mutton as 
the culprit for a number of reasons, in part because the observed 
value for 1971:3 was unusually high (near the maximum of the entire 
series) and that for 1973:2 was somewhat low, in comparison with 
adjacent sample values. Any of the other commodities could have 
been used. As shown in table 1, reducing 1971:3 mutton consumption 
by 1 percent, from 5.37 kilograms to 5.316 kilograms, and raising 
1973:2 mutton consumption from 2.41 kilograms to 2.55 kilograms 
was sufficient to remove the violation of the weak axiom. The extent 
of the adjustments necessary to restore the data set to consistency with 
utility maximization is certainly plausible as a correction for measure¬ 
ment error. In any case, the small corrections needed are not of the 
sort that would lead to the conclusion that preferences are shifting 
away from red meats. They involve only mutton. 

Mutton consumption was a minor component of meat expenditures 
and declined steadily throughout the sample, to the point that mut¬ 
ton’s share was never above 5 percent of meat expenditures after 
1975:1. Also, Beggs (1987) noted that in later quarters mutton prices 
were inferred from the lamb price, providing an additional reason to 
suspect that the mutton data are less reliable than the data for other 
meats. Consequently, we omitted the mutton data and tested for sta¬ 
bility of the demand for the four other meats, equivalent to assuming 
them to be separable from mutton. We also aggregated the values for 
mutton and lamb consumption and tested that data set. No violations 
were found in either case. As noted by Epstein and Yatchew (1985, 
p. 155), a statistical procedure is not needed with such an outcome. 

With the original data set, we also applied the more general test for 
violations of the strong axiom but found no further cases beyond the 
two observations inconsistent with the weak axiom in the actual data. 9 

9 We thank Hal Varian for providing us with his programs for nonparametric demand 
analysis, which were used for checking consistency with the strong axiom. 

Again, with mutton excluded or aggregated with lamb consumption, 
there were no violations. These findings support the hypothesis that 
observed consumption patterns in the Australian data set are consis¬ 
tent with a stable set of demands. Even if the measurement error 
interpretation is unacceptable and instead one questions the series for 
mutton consumption, there do not appear to have been shifts away 
from lamb and beef toward chicken. 

Using the U.S. data for 1947-78 from Blanciforti (1982), we ob¬ 
served the same outcome: no violations of revealed preference were 
discovered. An identical result was obtained with the data used by 
Wohlgenant (1985), which include the years 1947-83. This repre¬ 
sents an additional 5 years of more recent data and takes the sample 
well beyond the dates when most observers consider structural 
changes to have occurred. 

The revealed preference results indicate that these data sets are 
consistent with maximization of a stable utility function. A complete 
explanation of changes in consumption patterns over time can be 
given using prices and expenditures. Whatever biases have been in¬ 
duced by the separability assumption or by imposing restrictions from 
the theory of the individual consumer with per capita data, they are 
not of a form that causes us to reject the stability of a well-behaved set 
of demands defined over the included goods. Only if the hypothesis 
of stable preferences is replaced with one that requires stable prefer¬ 
ences of a particular functional form can rejection occur. 


VII. The Interpretation and Power of the 
Nonparametric Test 

One drawback to testing hypotheses with the nonparametric ap¬ 
proach appears to be the unknown power of the test. Is the failure to 
reject the hypothesis of stable preferences a strong indication of the 
absence of structural change, or could the data be consistent with 
revealed preference even in the presence of substantial structural 
change? 

The power issue relates to observations made by Landsburg (1981) 
and Varian (1982) that when the nonparametric method is applied to 
aggregate consumption series, one is unlikely to find a violation of 
revealed preference axioms because the budget lines drawn for an¬ 
nual observations rarely cross. Aggregate consumption of every good 
is rising through time in such applications, so each bundle of goods is 
revealed preferred to all previous ones. Concerns over power there¬ 
fore relate not so much to the method as to the data set under study. 
If growth in real expenditure dwarfs variation in relative prices, it is 
likely that less will be revealed about substitution relationships among 
goods than if real expenditure remained relatively constant. This will 
be true using either parametric or nonparametric methods. 

We examined this problem using our data sets. It seems reasonable 
to expect the power of nonparametric tests to be higher for disag¬ 
gregated goods such as these than for more aggregated goods. Quan¬ 
tities consumed do not all rise uniformly with time, and relative price 
variation is likely to be greater, relative to variation in real expendi¬ 
ture, than for more aggregated bundles of goods. However, this is 
much more true of the Australian data than the U.S. data, mainly 
because the U.S. data are annual and cover a longer time period. 
Between observations and across the sample, the growth in real ex¬ 
penditure is greater in the U.S. data. 

The matrix $\Phi$ used to perform the tests illustrates this notion of 
power. Recall that the data were arranged chronologically, so that the 
first row relates to period 1 prices, the second to period 2 prices, and 
so on, while the columns involve observed quantities in the same way. 
When real expenditures rise through time, it is likely that the cost at 
any time t of buying bundles purchased earlier in the sample will be 
less than observed expenditures at time t. Similarly, the cost of bun¬ 
dles purchased later in the sample, when measured at time t prices, is 
likely to exceed actual expenditures at time t. The extreme case occurs 
when all numbers below the diagonal in $\Phi$, call them $\Phi_{ij}$ (i > j), are less 
than the diagonal elements for any time period; and, at the same 
time, all elements above the diagonal, $\Phi_{ij}$ (i < j), exceed $\Phi_{ii}$. 

To indicate the importance of this type of problem, for each data 
set we checked the number of occurrences of $\Phi_{ij} < \Phi_{ii}$ for entries 
above and below the diagonal. The results are given in table 1. 
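
The count just described can be sketched as follows. This Python fragment is an illustration only: the function name affordability_counts and both small cost matrices are hypothetical. Rows are taken to index prices and columns to index bundles, in chronological order, so an entry above the diagonal compares a later bundle with expenditure at earlier prices.

```python
from typing import List, Tuple


def affordability_counts(phi: List[List[float]]) -> Tuple[int, int]:
    """Count off-diagonal entries smaller than the diagonal entry of their
    row. 'above' counts later bundles that were affordable at earlier
    prices; 'below' counts earlier bundles affordable at later prices."""
    n = len(phi)
    above = sum(1 for i in range(n) for j in range(i + 1, n)
                if phi[i][j] < phi[i][i])
    below = sum(1 for i in range(n) for j in range(i)
                if phi[i][j] < phi[i][i])
    return above, below


# Hypothetical case in which real expenditure grows every period: each
# later bundle costs more than was actually spent in any earlier period.
growth_dominated = [[10.0, 12.0, 14.0],
                    [11.0, 13.0, 15.0],
                    [12.0, 14.0, 16.0]]

# Hypothetical case in which budget sets cross.
mixed = [[10.0, 9.0, 11.0],
         [11.0, 10.0, 9.0],
         [9.0, 11.0, 10.0]]

counts_growth = affordability_counts(growth_dominated)
counts_mixed = affordability_counts(mixed)
print(counts_growth, counts_mixed)
```

A zero count above the diagonal corresponds to the extreme case in which no later bundle was ever affordable earlier, so the revealed preference test cannot reject.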

In the Australian data, there are 1,088 instances in which observed 
bundles were affordable at price and expenditure levels observed 
earlier in the data set. While this is less than the 1,345 observed below 
the diagonal, it is not so small that concerns over power need be great. 
On the other hand, with Wohlgenant’s data set, there are 73 above the 
diagonal and 581 below. This reflects the greater growth in real ex¬ 
penditures for meats over the longer time period covered by these 
data. Real expenditures were smallest in the sample in 1951, reaching 
their peak in 1976 at a level over 40 percent higher. They declined 
slightly and almost monotonically from there to 1983. A similar pat¬ 
tern exists in Blanciforti’s data. 

Even when the power is low in nonparametric tests, relative to some 
desired level, it does not follow that the parametric approach should 
be adopted. There are two reasons for this. First, the same concerns 
over the nature of the data set would affect tests that require the 
estimation of demand systems. In a parametric analysis, it seems likely 
that when Engel curves explain more of the variation in quantities 
consumed, less is revealed about substitution between goods, just as 
with the nonparametric approach. Second, when a particular form 
for the demand system is estimated, the test for stable preferences is 
replaced by a test for stable preferences of the translog, almost-ideal, 
or some other form. It is well known that an arbitrarily chosen de¬ 
mand system may perform very badly as an approximation of the 
mechanism by which the data were generated (e.g., Deaton 1986). 
Rejection of hypotheses such as stability or homogeneity could then 
be due to use of the wrong functional form rather than a rejection of 
the economic proposition. 

This problem and the concern over growth in real expenditures in 
time-series data sets led us to consider an alternative procedure. It is 
easy to see that the power of our test for stable preferences depends 
on the variation in relative prices being large relative to that of expen¬ 
ditures, though it is hard to quantify the power relationship. Monte 
Carlo simulations might be formulated to do so under a variety of 
conditions, but care must be taken in applying such results to data 
sets that might not follow the mechanisms used to generate the 
simulations. 

Another procedure that might be used to increase the power of 
nonparametric procedures involves using the data set under study 
rather than simulated data. It suffers from the same criticism as the 
parametric approach: it is arbitrary and requires the use of a joint 
hypothesis, but we find it considerably more intuitive. 

The problem with the parametric approach is that there would 
have to be a search over an infinite number of functional forms to 
“prove” that stable preferences do not exist; every structural change 
test with a particular functional form can only indicate instability of 
that form. Picking a functional form is a means of imposing prior 
beliefs on a demand system in an attempt to glean more information 
from the data set. However, we would expect that no one has any 
priors about functional forms, apart from a requirement that they be 
compatible with economic theory and a preference that they be some¬ 
what parsimonious. 

An alternative way of imposing prior information on the analysis is 
to perform the equivalent of “detrending” the data by making an 
adjustment for real expenditures. This involves imposing prior be¬ 
liefs about income elasticities, which seems more natural than impos¬ 
ing priors about functional form. Income elasticities are easily linked 
to past experience with similar data sets and can be connected with 
the Bayesian approach to inference with much less difficulty than can 
the problem of selecting the functional form. 10 

To see if our conclusions concerning the stability of preferences 

10 See Rossi (1985) for an example of selection of functional form using Bayesian 
techniques. 

were a consequence of real income growth alone, we imposed prior 
beliefs about income elasticities as follows. First, each data point was 
adjusted for changes in the level of real expenditures (actual expendi¬ 
tures on meats divided by the consumer price index), on the basis of 
an assumed expenditure elasticity of one. We replaced the observed 
quantity of good i at time t with an adjusted value, calculated using 
$\tilde{Q}_{ti} = Q_{ti} \times (1 - \mathrm{DINC}_t)$, where $\mathrm{DINC}_t$ is the percentage by which 
real expenditures at time t exceed the minimum for the sample. 
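
The adjustment can be sketched in Python as follows. This is a minimal illustration, assuming that the adjusted quantity is the observed quantity multiplied by one minus DINC, with DINC expressed as a proportion; the function name adjust_quantities and the sample numbers are hypothetical.

```python
from typing import List


def adjust_quantities(quantities: List[List[float]],
                      real_expenditure: List[float]) -> List[List[float]]:
    """Scale every quantity in period t by (1 - DINC_t), where DINC_t is
    the proportion by which real expenditure in period t exceeds the
    sample minimum (an assumed expenditure elasticity of one)."""
    base = min(real_expenditure)
    adjusted = []
    for q_t, x_t in zip(quantities, real_expenditure):
        dinc = (x_t - base) / base
        adjusted.append([q * (1.0 - dinc) for q in q_t])
    return adjusted


# Hypothetical data: three periods, two goods. Real expenditure is
# nominal expenditure on the group divided by a consumer price index.
real_expenditure = [100.0, 110.0, 125.0]
quantities = [[10.0, 5.0],
              [11.0, 6.0],
              [12.0, 8.0]]

adjusted = adjust_quantities(quantities, real_expenditure)
print(adjusted)
```

The adjusted quantities would then be combined with observed prices to rebuild the cost matrix and rerun the revealed preference checks.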

After this adjustment to reduce variations in the level of expendi¬ 
tures, the constructed quantities should lie mostly on budget lines that 
cross. The remaining variation in quantities consumed was then 
tested for consistency with revealed preference. To a certain extent, 
this violates the spirit and intent of the approach; additional assump¬ 
tions are introduced and become joint hypotheses that are being 
tested. However, this might be preferable to the application of the test 
to unadjusted data, depending on one’s view. It is certainly preferable 
to imposing an arbitrary functional form or searching over alternative 
specifications. 

Our results are shown in table 1. If alternative income elasticities 
are considered, it would be possible to report intervals for each elas¬ 
ticity within which there is no significant violation of revealed prefer¬ 
ence. The user of such information would be free to choose whether 
the constructed intervals are reasonable as a description of reality. 
The same exercise could be performed using price elasticities if the 
constant elasticity assumption is acceptable. 

The results continue to support the stable preferences hypothesis. 
Even after the quantities consumed were adjusted by the level of 
expenditures, they were consistent with the strong axiom of revealed 
preference. Variations in constructed quantities, after adjustment for 
changes in the level of real expenditures, are consistent with maximi¬ 
zation of a stable utility function. 

VIII. Summary and Conclusions 

This paper has used a nonparametric approach to test for structural 
change in the demand for meat, applied to the United States and 
Australia. A check for consistency with the axioms of revealed prefer¬ 
ence showed that the data from both countries could have been gen¬ 
erated by stable preferences. Therefore, any conclusions from these 
data sets that tastes have changed must come in the form of restrictions 
on the nature of these demand systems (e.g., to be of the almost- 
ideal form). The data alone do not indicate changes in preferences. 
Relative prices, instead, can account for the observed shifts in con¬ 
sumption patterns. 




There is some uncertainty about the power of such a test. Power is 
likely to be low for most data sets consisting of aggregate consumption 
since each year’s consumption bundle will be revealed preferred to 
that of the previous year. For disaggregated goods such as meats in 
the data sets we examined—where relative prices have varied consid¬ 
erably while real expenditure on the meats group varied less—the test 
seems likely to have sufficient power. Unfortunately, while this notion 
of power is analogous to the usual statistical definition, it is harder to 
quantify. 

Our investigations of data constructed from the observed quantities 
using a procedure that adjusts for growth in real expenditures sup¬ 
ported the findings with unadjusted data. We imposed restrictions on 
expenditure elasticities, corrected for expenditure effects, and found 
that variation in relative prices continued to explain the remaining 
shifts in consumption patterns. 

It will be of interest to attempt to estimate a demand system and 
elasticities consistent with the stable preferences indicated by the re¬ 
vealed preference results. Experimentation with alternative func¬ 
tional forms would be required, and the drawbacks with that have 
been discussed. Several authors have questioned the approximation 
potential of models such as the translog or almost-ideal. It may be that 
testing for structural change as Wohlgenant (1985) did, using the 
variable number of parameters of the Fourier flexible form, will avoid 
misspecification implicit in more restrictive forms. That approach 
falls between the fixed-parameter forms such as the almost-ideal and 
the completely nonparametric approach using revealed preference. 

The assumptions concerning weak separability and aggregation 
over consumers can be challenged. However, our nonparametric re¬ 
sults do support the stability of well-behaved separable per capita 
demands for beef, chicken, lamb, mutton, and pork in Australia and 
for beef, fish, pork, and poultry in the United States. On this basis, we 
interpret past work as rejecting the stability of particular functional 
forms for those demands rather than as an indication of structural 
changes in demand. 

We began the paper by distinguishing taste changes reflected in 
commodity demands from those discussed by Stigler and Becker 
defined over fundamental goods. To most people, there seem to be 
fairly compelling reasons to believe that consumer preferences for 
meat have changed. It may be the case that household production 
functions have shifted over time and that meat is being perceived and 
used differently by consumers. Yet when conventional demand theory 
is applied to market demands, we find that we are unable to reject 
stability of a set of preferences defined over market quantities of 
meat items. 
Stigler and Becker appealed to economists to augment their models 
to account for changes in demand. Our results indicate that, in the 
case of meat, there are not any changes to be accounted for. More¬ 
over, we found nearly identical results in two countries where the 
conventional wisdom is that preferences have changed in important 
ways in recent years. This duplication of results adds force to our 
conclusion that we can account for the conventional wisdom: the 
models that indicated structural change did so because of the chosen 
functional form. 

Compensating Differentials and 
Self-Selection: An Application to Lawyers 


John H. Goddeeris 

Michigan State University 


This paper models individual choice between two types of jobs as 
dependent on the difference in potential earnings and on prefer¬ 
ences for nonpecuniary compensation. The model leads to simul¬ 
taneous estimation of earnings and job choice functions in a manner 
that takes account of self-selection of individuals into the sector of 
highest utility. An application to lawyers choosing between private 
and “public-interest” law casts doubt on the notion that public- 
interest lawyers are accepting substantially lower earnings by virtue 
of their choice—an impression derived from estimation of earnings 
functions without accounting for self-selection. The estimation tech¬ 
nique also takes proper account of the “choice-based” nature of the 
sample, which appears to be important. 


The notion that individual choices among jobs are influenced by more 
than just monetary considerations has an obvious intuitive appeal and 
has begun to draw increasing attention in theoretical and empirical 
economic research. 1 As is well known, differences in the non¬ 
pecuniary characteristics of jobs can give rise to differences in wages 
among equally productive workers since particularly desirable nonwage 
aspects compensate for lower pay. To date most empirical work 
on compensating differentials has assumed, at least implicitly, that 
workers are identical in their preferences for wages and nonpecuniary 
job characteristics and that selection among types of jobs is 
not related to unmeasured determinants of productivity. 2 The former 
assumption seems contrary to everyday experience, and the latter is at 
least questionable. To improve our understanding of nonpecuniary 
compensation, it is therefore desirable to consider models that relax 
both assumptions. 

This paper develops and estimates a model of choice between two 
types of jobs, where an individual’s choice depends on the difference 
in potential earnings and on personal preferences for nonwage as¬ 
pects of the jobs. Potential earnings in the two jobs are estimated 
simultaneously with the choice in a manner that accounts for the 
possibility of self-selection based on unmeasured determinants of 
earnings. The paper also serves as an application of estimation tech¬ 
niques appropriate for choice-based samples (Manski and Lerman 
1977; Cosslett 1981; Manski and McFadden 1981) to a simultaneous 
model with continuous and discrete dependent variables. 3 The model 
is applied to a sample of lawyers in the private and “public-interest” 
sectors, extending the work of Weisbrod (1983). The results suggest 
that preferences for public-interest work are related to personal char¬ 
acteristics. More striking, apparent large wage sacrifices made by 
public-interest lawyers—which show up in earnings functions esti¬ 
mated by least squares—disappear when the complete model is esti¬ 
mated. Actual wage sacrifices (or rents received) cannot, however, be 
estimated with precision when appropriate techniques are used. 

The paper is structured as follows. Section I discusses the model 
and the nature of the empirical application, including data sources. 
Section II presents the results and provides some evidence on their 
robustness to changes in model specification. A brief summary in 
Section III concludes the paper. 


2 A typical approach is to estimate by least squares an earnings function of the form y 
= a'x + bd + e, where y is earnings, x is a vector of productivity-related personal 
characteristics, d is a dummy variable indicating the presence of a particular non¬ 
pecuniary job characteristic, and e is a disturbance that includes unmeasured influences 
on earnings. The assumption that there is a single value for b —the compensating value 
of the job characteristic—implies identical preferences, and the use of least squares to 
estimate it assumes that the choice between jobs having and not having the characteris¬ 
tic is uncorrelated with e. 

3 The only other published application I am aware of is Poirier (1981), which involves 
a similar model but does not estimate it jointly. The choice of regime equation is 
estimated using an appropriate method for a choice-based sample, and the results are 
then used to correct the other equations for possible selection bias. 

I. Model Specification 

A. Background 

Weisbrod (1983) has compared lawyers employed in public-interest
law firms with those employed in the private sector. He concludes that
public-interest lawyers receive much lower wages than their counterparts
of similar characteristics in the private sector and that the likelihood
of choosing the public-interest sector depends (negatively) on
the size of the sacrifice in wages required and on certain background
characteristics of the individual lawyer. These findings suggest that
lawyers choosing public-interest law are deliberately sacrificing
substantial pecuniary gain for nonpecuniary rewards and that their
preferences between the two types of compensation are systematically
different from the preferences of those employed in the private
sector.

While these findings are not implausible, the empirical methods 
employed may be questioned for two reasons. First, the fact that those 
lawyers who choose one sector rather than another are not a random 
sample of the population may create a problem for the estimation of 
earnings functions. It is possible, for example, that those who choose 
public-interest law would have earned less in the private sector than 
actual private-sector lawyers with the same measured characteristics, 
that they know it, and that this in fact partly explains their choices. 
Comparison of public-interest lawyers’ wages with wages of private- 
sector lawyers with similar measured characteristics would in that case 
tend to overstate actual wage sacrifices. Weisbrod’s technique—which 
involves least-squares estimation of earnings functions for each sector 
separately—does not deal with this potential selection bias problem. 
Second, in the choice of sector equation he does estimate, the sample 
is treated as though it were randomly drawn from the population of 
lawyers. In fact the public-interest sector is a highly unusual choice (so 
that even a large random sample would likely contain very few public- 
interest lawyers), and in the data set he uses, these lawyers are heavily 
oversampled. With this sample design, techniques appropriate for the 
case of random sampling generally yield inconsistent estimates, and a 
method designed for a choice-based sample is needed. 4 This paper
describes and estimates a model similar to Weisbrod's, making use of
the same survey of lawyers, but modified to take account of the
limitations mentioned above.

4 In a sense this criticism is too strong. While the statement in the text is true, it has
been shown by McFadden (and reported in Manski and Lerman [1977]) that for the
special case of a logit model that includes a constant term (or, in the more general
polychotomous choice case, a set of alternative specific dummies), the random sampling
maximum-likelihood estimator consistently estimates all but the constant term in a
choice-based sample. This is in fact Weisbrod's case. Thus if the assumption of no
self-selection on the basis of unmeasured determinants of earnings is valid and the logit
model of choice is appropriate, his estimates are consistent (for all but the constant).
The simplicity of the logit case is lost, however, in the present model because of the
additional endogenous variables. A probit specification is used here primarily for
convenience in specifying the joint distribution of the disturbances. Some evidence on the
suitability of this specification is discussed below.

B. The Model

The problem dealt with here is formally similar to that of evaluating
the impact of any intervention or treatment. If subjects with a particular
set of observable characteristics are randomly assigned to treatment
and control groups, then the difference in average outcome
between the two groups consistently estimates the expected treatment
effect for a randomly chosen individual with those characteristics. If,
however, the effect varies across subjects because of unobservables
and the allocation to groups is related to its size, this method will not
consistently estimate an expected effect for a random individual. This
is a familiar problem of sample selection bias (see Heckman and Robb
[1985] for a general discussion of the estimation of treatment effects
in the presence of selection decisions).

Here the "treatment" is the choice of one job rather than another,
and the impact I wish to evaluate is its effect on earnings. I have no
reason to expect this effect to be identical across individuals and, if it
does vary, every reason to expect the choice of job to be related to its
size. Indeed, in a simple deterministic model of earnings maximization,
the sign of the differential would dictate the choice. Therefore, a
model that incorporates the possibility of such selection bias is
desirable.

The economic sense of the model adopted is straightforward. An
individual chooses a job (sector of employment) to maximize utility.
The utility of a job depends on its associated earnings and
nonpecuniary characteristics. Within a sector, potential earnings for an
individual depend on measured personal characteristics (traditional
human capital variables) and other things (e.g., unmeasured ability or
random noise). Preferences for nonpecuniary aspects of jobs vary
across individuals, and these preferences are correlated with
background characteristics.

These assumptions are captured formally in the following model.
For an individual $i$, utility $u_{ji}$ in sector $j$ is given by

$$u_{ji} = \gamma y_{ji} + \alpha_j' z_i + \phi_{ji}, \qquad j = 0, 1, \tag{1}$$

where $y_{ji}$ represents (the log of) potential earnings for individual $i$ in
sector $j$, $z_i$ is a vector of background characteristics related to
nonpecuniary job preferences (all vectors are column vectors), $\phi_{ji}$
represents other influences on utility, and $\gamma$ and $\alpha_j$ are parameters.

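An individual's choice of sector is determined by the sign of the
utility difference $S_i^*$ and is denoted by a categorical variable $S_i$:

$$S_i^* = u_{1i} - u_{0i} = \gamma(y_{1i} - y_{0i}) + (\alpha_1 - \alpha_0)' z_i + \phi_i,$$

$$S_i = \begin{cases} 0 & \text{if } S_i^* \le 0 \\ 1 & \text{if } S_i^* > 0, \end{cases} \tag{2}$$

where $\phi_i = \phi_{1i} - \phi_{0i}$.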

Potential earnings in each sector are given by

$$y_{ji} = \beta' x_i + \delta j + \epsilon_{ji}, \tag{3}$$

where $x_i$ is a vector of human capital characteristics, $\beta$ and $\delta$ are
parameters, and $\epsilon_{ji}$ represents unmeasured influences on potential
earnings in sector $j$ for individual $i$. The $\epsilon$'s are assumed to have zero
mean conditional on personal characteristics in the population. The
parameter $\delta$, representing the expected difference in log earnings in
the two sectors for a randomly chosen individual, is of particular
interest here.

Because earnings are observed in only one sector for each individual,
(2) is not useful for estimation purposes as written. But substituting
using (3) yields

$$S_i^* = \gamma\delta + (\alpha_1 - \alpha_0)' z_i + \gamma(\epsilon_{1i} - \epsilon_{0i}) + \phi_i, \tag{4}$$

which expresses $S_i^*$ as a function of observed characteristics and
unobserved disturbances. Rewriting (3) as a single function for actual
earnings makes it evident that the problem of sample selection bias arises
naturally from the economics of the model:

$$y_i = \beta' x_i + \delta S_i + [(\epsilon_{1i} - \epsilon_{0i}) S_i + \epsilon_{0i}], \tag{5}$$

where $y_i$ represents an individual's observed earnings. We might
imagine estimating this function by, for example, least squares. But
from (4) an individual's choice of sector (value of $S_i$) depends on the
difference in the $\epsilon$'s, which also shows up (multiplied by $S_i$) in the
disturbance of (5). Least squares applied to (5) can therefore not be
expected to estimate $\delta$ consistently.
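
The selection problem described by (4) and (5) can be illustrated with a small simulation. The sketch below is not the paper's estimation procedure. Every number in it (the two standard deviations, the values of $\gamma$ and $\delta$, and the constant standing in for $(\alpha_1 - \alpha_0)'z_i$) is a hypothetical choice made for illustration, and the $x$ variables are dropped so that least squares on (5) reduces to a difference in sector means.

```python
import random


def simulate_roy(n=200_000, delta=0.1, gamma=5.0, seed=0):
    """Simulate sector choice as in eq. (4) and observed earnings as in eq. (5).

    All parameter values are hypothetical; none are taken from the paper.
    Returns the naive difference in mean log earnings between sectors and
    the share of individuals choosing sector 1.
    """
    rng = random.Random(seed)
    sigma0, sigma1 = 0.5, 0.3      # assumed: earnings more dispersed in sector 0
    choice_const = -2.0            # stands in for (alpha_1 - alpha_0)'z_i
    sum_y = [0.0, 0.0]
    count = [0, 0]
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)          # one unmeasured factor, so rho = 1
        eps0 = sigma0 * ability
        eps1 = sigma1 * ability
        phi = rng.gauss(0.0, 1.0)              # preference disturbance, variance 1
        s_star = gamma * delta + choice_const + gamma * (eps1 - eps0) + phi
        s = 1 if s_star > 0 else 0
        y = 3.0 + delta * s + (eps1 if s == 1 else eps0)   # observed log earnings
        sum_y[s] += y
        count[s] += 1
    naive = sum_y[1] / count[1] - sum_y[0] / count[0]
    return naive, count[1] / n


naive_delta, share_sector1 = simulate_roy()
print(f"true delta = 0.1, naive estimate = {naive_delta:.3f}, "
      f"share choosing sector 1 = {share_sector1:.3f}")
```

Under these assumed values the naive comparison of sector means comes out negative even though the true $\delta$ is positive, because the individuals who select into sector 1 are disproportionately those with negative draws of the unmeasured factor.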

To rectify the problem, the estimation procedure must account for
the endogeneity of the choice of sector. It is clear from inspection of
(4) and (5) that things could be simplified by assuming that the
differential in potential earnings is the same for all individuals, so that
$\epsilon_{0i} = \epsilon_{1i}$. In that case $\delta$ can be estimated consistently from (5) by
instrumental variables, using $z$ as instruments for $S_i$, without invoking further
assumptions. In the spirit of the present model, however, it is important
that these unmeasured influences on earnings be allowed to differ
across sectors, and therefore additional structure must be imposed
(Heckman and Robb 1985). Assume that in the population, conditional
on personal characteristics $x$ and $z$, the $\epsilon$'s are bivariate normal
with zero means and that $\phi$ is normal and uncorrelated with them.
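
The instrumental-variables argument for the special case $\epsilon_{0i} = \epsilon_{1i}$ can be sketched in the same way. The example below is an illustration under assumptions that are not in the paper: a single binary $z$, no $x$ variables, hypothetical parameter values, and a preference disturbance $\phi$ that is deliberately correlated with $\epsilon$ so that least squares fails. With one binary instrument the IV estimator reduces to the ratio of the reduced-form difference in mean earnings to the first-stage difference in sector shares.

```python
import random


def simulate_iv(n=200_000, delta=0.1, seed=1):
    """Compare least squares and IV when eps_0i = eps_1i (homogeneous differential).

    All values are hypothetical. Returns (ols_estimate, iv_estimate) of delta.
    """
    rng = random.Random(seed)
    y_by_z = [0.0, 0.0]
    s_by_z = [0.0, 0.0]
    n_by_z = [0, 0]
    y_by_s = [0.0, 0.0]
    n_by_s = [0, 0]
    for _ in range(n):
        z = rng.randint(0, 1)                        # binary background characteristic
        e = rng.gauss(0.0, 1.0)
        eps = 0.5 * e                                # same disturbance in both sectors
        phi = 0.8 * e + 0.6 * rng.gauss(0.0, 1.0)    # unit variance, correlated with eps
        s = 1 if (-0.5 + 1.0 * z + phi) > 0 else 0   # sector choice depends on z and phi
        y = 3.0 + delta * s + eps                    # observed log earnings
        y_by_z[z] += y
        s_by_z[z] += s
        n_by_z[z] += 1
        y_by_s[s] += y
        n_by_s[s] += 1
    ols = y_by_s[1] / n_by_s[1] - y_by_s[0] / n_by_s[0]
    reduced_form = y_by_z[1] / n_by_z[1] - y_by_z[0] / n_by_z[0]
    first_stage = s_by_z[1] / n_by_z[1] - s_by_z[0] / n_by_z[0]
    return ols, reduced_form / first_stage


ols_estimate, iv_estimate = simulate_iv()
print(f"true delta = 0.1, OLS = {ols_estimate:.3f}, IV = {iv_estimate:.3f}")
```

In this setup the IV estimate lands close to the true value while least squares is biased upward. Once the $\epsilon$'s are allowed to differ across sectors, as in the model actually estimated, the disturbance in (5) itself depends on $S_i$ and this simple remedy is no longer enough.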

With these assumptions, $\beta$ and $\delta$ are identified in the model defined
by (3) and (4). As is typical in discrete choice analysis, $\gamma$ and $(\alpha_1 - \alpha_0)$
are identified only up to a positive multiplicative constant. A
normalization is therefore necessary in the choice of sector function, and
$\sigma_\phi^2 = 1$ is chosen for convenience. The likelihood function appropriate
for this model in the case of random sampling is discussed in the
Appendix.

The sample used in the estimation below is not, however, purely
random. It may be interpreted as a choice-based sample (Manski and
Lerman 1977), in which the sectors are sampled in some proportions
(different from their actual proportions in the population), and
individuals are chosen randomly from the population within each sector.
Maximization of the likelihood function appropriate for a random
sample yields inconsistent estimates if applied to this case. When the
true population proportions of individuals in the two sectors are
known, however, maximization of a simple modification of that
likelihood function leads to a consistent and asymptotically normal
estimator. The Manski-Lerman weighted exogenous sampling
maximum-likelihood (WESML) estimator may be adapted to this case, as
described in the Appendix (see also Heckman and Robb 1985).
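
The weighting idea behind WESML can be sketched as follows. The example assumes the standard Manski-Lerman form, in which each observation's log-likelihood contribution is multiplied by the population share of its chosen sector divided by that sector's share of the sample. The population share (0.00714) and the sector counts (836 and 368) are the values reported in this paper; the function names and the simple probit contribution are illustrative only, and the weights actually used for the joint model are those defined in the paper's Appendix.

```python
import math

POP_SHARE = {0: 1.0 - 0.00714, 1: 0.00714}   # population shares: private, public-interest
SAMPLE_N = {0: 836, 1: 368}                  # sample counts by sector


def wesml_weights(pop_share, sample_n):
    """Weight for sector j = population share of j / sample share of j."""
    n_total = sum(sample_n.values())
    return {j: pop_share[j] / (sample_n[j] / n_total) for j in sample_n}


def normal_cdf(v):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def weighted_probit_loglik(index_values, choices, weights):
    """Weighted log likelihood for a plain binary probit (a stand-in, not the joint model)."""
    total = 0.0
    for v, s in zip(index_values, choices):
        p = min(max(normal_cdf(v), 1e-12), 1.0 - 1e-12)   # guard against log(0)
        total += weights[s] * math.log(p if s == 1 else 1.0 - p)
    return total


weights = wesml_weights(POP_SHARE, SAMPLE_N)
weighted_pi = SAMPLE_N[1] * weights[1]
weighted_share = weighted_pi / (weighted_pi + SAMPLE_N[0] * weights[0])
print(weights, weighted_share)
```

After weighting, the oversampled public-interest observations account for the same share of the (weighted) sample as they do of the population, which is what restores consistency in the choice-based case.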

C. Data 

The data used are from the same survey of lawyers analyzed by Weisbrod
and collected between August 1973 and May 1974 by researchers
at the University of Wisconsin Law School and the Institute for
Research on Poverty (for further details on the survey, see Handler,
Hollingsworth, and Erlanger [1978], esp. app. A). The survey includes
a national random sample of lawyers 5 and several subsamples
of lawyers in particular types of work. I define the two sectors
somewhat differently than Weisbrod did. The public-interest sector is
defined here to include not only lawyers in private nonprofit law firms
"engaged in class-oriented activities of a left, reformist sort"
(Weisbrod 1983, p. 250) but also lawyers employed in the Legal Services
Corporation, a government agency formed as a part of the War on
Poverty to provide legal aid and engage in law reform activities for
the poor. 6 The non-public-interest sector (referred to as "private
law") is also defined more broadly here to include all other types of
law jobs, not just employment in private law firms. 7

5 Handler et al. (1978) report that younger lawyers are oversampled. No account of
that is taken in the estimation here.

The sample analyzed includes all practicing lawyers in the random
sample who classify themselves as employed full-time at one job and
for whom complete data are available on the variables used in the
analysis and all other full-time, one-job public-interest lawyers for
whom complete data are available. 8 This yields a sample size of 843
from the random sample, seven of whom are public-interest lawyers,
and an additional 361 public-interest lawyers. The value used for the
population proportion of lawyers in the public-interest sector is
0.00714. 9 To test for sensitivity to possible error in this number, the
basic model was also estimated with the population proportion
adjusted by 10 percent in each direction, with little effect on the
estimated parameters.

The dependent variable in the earnings function is the natural log
of 1973 earnings (earnings are measured in thousands of dollars). No
information on hours of work for full-time lawyers is available in the
data set. The variables used to explain earnings, that is, the $x$'s in
(3), are: EXPR: experience, defined as 1973 minus year of law
school graduation; EXPRSQ: experience squared; LSQUAL: quality
of law school, as rated by a panel of legal experts on a 1-6 scale, 1
being highest quality (note that the expected sign of the coefficient is

6 The Legal Services lawyers are quantitatively more important than the other
public-interest lawyers, by a ratio of about 6 to 1. More information on Legal Services can be
found in Handler et al. (1978).

7 A practical reason for defining the sectors differently than Weisbrod is to increase
sample sizes. Since the model is one of dichotomous choice, it also seems important to
divide law jobs exhaustively into two categories thought, a priori, to be relatively
homogeneous in terms of nonpecuniary characteristics.

8 This additional sample is treated as a random sample from the population of
public-interest lawyers. In the WESML estimation, all public-interest law observations
are given the same weight, regardless of whether they entered as part of the random
sample of all lawyers. This is consistent with formula (1.54) in Manski and McFadden
(1981, p. 29).

9 This was calculated using estimates of the total number of lawyers and judges in the
United States at the time of the survey, reported in the U.S. Statistical Abstract, 1982,
and of the number of Legal Services and other public-interest lawyers from Handler et
al. The seven public-interest lawyers found in a sample of 843 represent a fraction of
0.0083, very consistent with the assumed population proportion. None of the
public-interest lawyers found in the random sample are from nongovernment public-interest
law firms (all are Legal Services lawyers), but this is not very surprising. If the
proportion of such lawyers in the population is 0.0012 (consistent with my assumptions), the
probability of finding zero in a sample of size 843 is .36.
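
The arithmetic in the note on the population proportion can be checked directly. The sketch below reads the proportion of nongovernment public-interest lawyers as 0.0012 and treats the 843 draws as independent, which is an assumption of the check rather than something stated in the paper.

```python
# Fraction of public-interest lawyers found in the random sample.
sample_fraction = 7 / 843

# Probability that a random sample of 843 contains no nongovernment
# public-interest lawyers if their population proportion is 0.0012
# (independent draws assumed).
prob_none = (1 - 0.0012) ** 843

print(round(sample_fraction, 4), round(prob_none, 2))
```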
TABLE 1

Means and Standard Deviations of Variables

                    Private Law           Public-Interest Law
                    (N = 836)             (N = 368)

Variable            Mean       SD         Mean       SD

EARNINGS            32.7       18.8       17.0       5.5
ln(EARNINGS)        3.34       .549       2.79       .294
EXPR                14.5       12.5       9.18       8.89
EXPRSQ              366        548        163        353
LSQUAL              3.23       1.58       2.82       1.56
TOP                 .500       .500       .424       .495
COMSIZ              .461       .499       .380       .486
ACTPOL              .225       .418       .489       .501
LEFT                .361       .481       .815       .389


therefore negative); TOP: a dummy variable, set equal to 1 if in top 
quartile of law school class, 0 otherwise. 

Preliminary analysis suggested that a heteroscedastic specification
for the $\epsilon$'s was appropriate. 10 The specification

$$\sigma_j(i) = \sigma_{j1} + \mathrm{EXPR}_i^{1/2} \cdot \sigma_{j2} \tag{6}$$

was used, with $\sigma_{j1}$ and $\sigma_{j2}$, $j = 0, 1$, parameters to be estimated. The
correlation $\rho$ between $\epsilon_1$ and $\epsilon_0$ was assumed independent of EXPR.

The $z$ variables from (2), those thought to be related to preferences
for nonpecuniary aspects of public-interest versus other types of law,
are: COMSIZ: community size where the lawyer grew up; 1 if small
town or rural area, 0 otherwise; ACTPOL: 1 if politically active in
college, 0 otherwise; LEFT: 1 for lawyers classifying themselves as
liberal, left liberal, or radical, 0 for others. Means and standard
deviations of the variables for both groups of lawyers are included in
table 1.


II. Results 

A. The Basic Model 

Before we turn to the results of estimation of the basic model, it is
instructive to consider an earnings function like (5) estimated from
the full data set without regard for issues of self-selection. Such
estimates are presented in column 1 of table 2. They take account of

10 The possibility of heteroscedasticity related to EXPR seemed plausible a priori.
The particular specification was chosen after analysis of residuals from least-squares
estimation of the private-law earnings functions. While those residuals may not
consistently estimate the $\epsilon$'s because of self-selection, some support for the adopted
specification may be found in the fact that all four variance parameters are large
relative to their standard errors.
TABLE 2
Results

                    No Selection,
                    No Weighting     Basic Model      No Weighting
                    (1)              (2)              (3)

CONST               2.94             2.74             2.81
                    (.034)           (.050)           (.040)
EXPR*               .624             .765             .660
                    (.033)           (.051)           (.034)
EXPRSQ**            -.125            -.150            -.132
                    (.008)           (.012)           (.008)
LSQUAL*             -.374            -.175            -.301
                    (.067)           (.101)           (.067)
TOP                 .121             .169             .123
                    (.020)           (.031)           (.021)
$\delta$            -.446            .113             -.262
                    (.022)           (.187)           (.041)

                    Choice Function

$\gamma$                             7.34             2.24
                                     (5.45)           (476)
CONST                                -3.58            -1.35
                                     (.528)           (.097)
COMSIZ                               -.226            -.154
                                     (.128)           (.089)
ACTPOL                               .297             .464
                                     (.126)           (.091)
LEFT                                 .710             1.17
                                     (.119)           (.098)

                    Variance Parameters

$\sigma_{01}$       .347             .391             .304
                    (.025)           (.033)           (.025)
$\sigma_{02}$*      .360             .443             .542
                    (.080)           (.088)           (.082)
$\sigma_{11}$       .154             .183             .169
                    (.020)           (.058)           (.020)
$\sigma_{12}$*      .368             .516             .339
                    (.081)           (.091)           (.074)

Note: Entries are parameter estimates; asymptotic standard errors are in parentheses.
* Multiply reported estimates and standard errors by .1.
** Multiply reported estimates and standard errors by .01.



heteroscedasticity in the manner described above—the variance of 
the disturbance term depends on both experience and the choice 
of sector—but assume that, conditional on the sector, the disturbance 
has zero mean. The procedure used is maximum likelihood if these 
assumptions are valid. 11 

11 Ordinary least squares estimates are very similar. In particular, the value of $\delta$ is

As shown in table 1, the mean earnings of public-interest lawyers
are far lower than those of private lawyers in the sample ($17,000
compared to $32,700). This difference is attenuated only slightly if
one uses the function reported in column 1 of table 2 and compares
predicted earnings in the two sectors for a lawyer with the mean
characteristics of public-interest lawyers in the sample. Predicted
earnings are $25,900 in the private sector and $16,300 in the
public-interest sector, a difference of 56 percent of estimated public-interest
earnings. 12 This is larger than the 20 percent difference estimated by
Weisbrod, and if it did represent the value of nonpecuniary benefits
to the average public-interest lawyer, a compensating differential this
large would surely merit further study. When these estimates are
taken at face value, the differential also appears to be estimated quite
precisely, the t-statistic for $\delta$ having a value greater than 20.
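
The predicted-earnings comparison can be retraced from the published tables. The sketch below takes the column 1 coefficients of table 2 (applying the scaling footnotes, so that the EXPR and LSQUAL entries are multiplied by .1 and the EXPRSQ entry by .01) and the public-interest means from table 1, and exponentiates the fitted log earnings as described in the note on predicted earnings. The dictionary layout and variable names are illustrative, and because the inputs are rounded table entries the output need not match the figures in the text to the last digit.

```python
import math

# Column 1 of table 2, after applying the table's scaling footnotes.
COEF = {"CONST": 2.94, "EXPR": 0.0624, "EXPRSQ": -0.00125,
        "LSQUAL": -0.0374, "TOP": 0.121}
DELTA = -0.446          # sector shift in log earnings, column 1
DELTA_SE = 0.022        # its reported standard error

# Mean characteristics of public-interest lawyers, table 1.
MEANS_PI = {"CONST": 1.0, "EXPR": 9.18, "EXPRSQ": 163.0,
            "LSQUAL": 2.82, "TOP": 0.424}

log_private = sum(COEF[k] * MEANS_PI[k] for k in COEF)
private = math.exp(log_private)              # thousands of 1973 dollars
public = math.exp(log_private + DELTA)
gap = private / public - 1.0                 # difference relative to public-interest earnings
t_stat = abs(DELTA) / DELTA_SE

print(f"private = {private:.1f}, public-interest = {public:.1f}, "
      f"gap = {gap:.2f}, t = {t_stat:.1f}")
```

With these rounded inputs the private-sector prediction is about 25.9 thousand dollars and the proportional gap is about 56 percent, matching the text; the public-interest prediction comes out near 16.6 thousand, slightly above the printed figure, which may reflect rounding in the published entries or the exact means used.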

Column 2 of table 2 presents the results of numerical maximization
of the weighted log-likelihood function for the basic model described
by (3) and (4), with the asymptotic covariance matrix estimated as
described in the Appendix. The parameter $\rho$ was initially permitted
to vary, but the WESML function was maximized at a value of $\rho = 1$,
on the boundary of the parameter space. Therefore, the reported
results involve $\rho$ set equal to one; that is, it is not treated as an
estimated parameter. 13

The most notable feature of the column 2 estimates is the value of
$\delta$. Far from the negative and apparently highly significant value in
column 1, the value is positive, suggesting that, on average, potential
earnings are larger in the public-interest sector for randomly chosen
lawyers. To be sure, the estimate is quite imprecise (with a t-statistic of
about .6), but the contrast to column 1 is nonetheless striking.

12 Here "predicted earnings" were calculated by taking antilog($\beta' x + \delta j$) using the
mean $x$ in the public-interest sample. Given the assumptions of the model, this method
consistently estimates the median (not mean) of earnings conditional on $x$ for those
who choose sector $j$ (Goldberger 1968).

13 This result occurred also for all other variants of the model that involved this error
structure. It is admittedly somewhat troubling, but there is no problem of
unboundedness of the likelihood function, and $\rho = 1$ does have a reasonable economic
interpretation, as discussed below.

How is a positive value of $\delta$ possible, given that earnings of
public-interest lawyers are so much lower than the earnings of lawyers in the
private sector? If the point estimates are interpreted as true parameter
values, the story they seem to be telling is as follows. The effect on
earnings of individual differences not captured in the $x$ variables
(included here in the $\epsilon$'s) works in the same direction in either sector.
The effect is simply larger in absolute value in private law (the
variance of $\epsilon$ is larger in sector 0 for all realistic values of experience). As
a result, those lawyers who choose public-interest law come
preponderantly from among those with large negative $\epsilon$'s. The reason, as
suggested in the seminal paper by Roy (1951), has to do with
comparative advantage. These lawyers are at an absolute disadvantage
relative to the average in either sector, but their comparative
advantage lies in the sector in which earnings are more concentrated.
Because of this systematic self-selection of negative-$\epsilon$ lawyers into
public-interest law, lawyers in that sector tend to have lower earnings than
those of similar measured characteristics in private law. If the model
and estimates are believed, however, the apparent large monetary
sacrifices made by public-interest lawyers are illusory: they disappear
once self-selection is properly accounted for.
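
The claim that the dispersion of $\epsilon$ is larger in private law at all realistic experience levels can be checked against the column 2 variance parameters. The sketch below assumes that specification (6) has the form $\sigma_j = \sigma_{j1} + \mathrm{EXPR}^{1/2}\sigma_{j2}$ and that the starred entries in table 2 are to be multiplied by .1; both readings are assumptions about the printed table, and the function names are illustrative.

```python
import math

# Column 2 of table 2: (sigma_j1, sigma_j2) for sector j, with the
# starred sigma_j2 entries multiplied by .1 as the table note instructs.
SIGMA_PARAMS = {0: (0.391, 0.0443),   # private law
                1: (0.183, 0.0516)}   # public-interest law


def sigma(sector, experience):
    """Standard deviation of epsilon in a sector at a given experience level (assumed form of eq. 6)."""
    base, slope = SIGMA_PARAMS[sector]
    return base + slope * math.sqrt(experience)


def crossover_experience():
    """Experience level at which the two standard deviations would be equal."""
    (a0, b0), (a1, b1) = SIGMA_PARAMS[0], SIGMA_PARAMS[1]
    return ((a0 - a1) / (b1 - b0)) ** 2


for years in (0, 10, 20, 40):
    print(years, round(sigma(0, years), 3), round(sigma(1, years), 3))
print("crossover at about", round(crossover_experience()), "years")
```

Under this reading the private-law standard deviation exceeds the public-interest one at every experience level a working lawyer could have; the two would cross only at several hundred years of experience. With $\rho = 1$ this means a given unmeasured disadvantage lowers potential earnings by more in private law than in public-interest law.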

Turning to the estimates of the parameters of the choice function,
the value of $\gamma$ is positive as expected: the partial effect of an increase
in earnings in one sector is to increase the probability of choosing that
sector. However, $\gamma$ is not estimated with precision. The three $z$
variables appear to add some explanatory power. The probability of
choosing public-interest law is lower for those who grew up in small
communities and higher for those who were active in politics in
college and who classify themselves as liberal or radical. The asymptotic
t-statistic for the last variable is particularly large.

Taken together, these variables exert a statistically significant
influence on the choice of sector. A test of the hypothesis that all
coefficients in the choice function except the constant are zero is
strongly rejected (Wald statistic value = 49.6; .90 critical value =
12.0). 14 At the same time, the estimated function does not go far
toward explaining who chooses the public-interest sector in terms of
measured variables. Calculated at the mean values of the exogenous
variables for the public-interest lawyers and the column 2 parameters,
the probability of choosing this sector is approximately .013, which
compares with the assumed probability of .0071 for a randomly
chosen lawyer, and an estimate of about .0044 for a lawyer with the mean
private lawyer characteristics. According to these estimates, a lawyer
with the mean public-interest lawyer characteristics is about twice as
likely to choose public-interest law as a randomly chosen lawyer. At
the same time, those who choose this kind of law are clearly outliers,
even among those with the same measured characteristics.

14 Given the nature of the estimation technique, Wald tests are most convenient for
tests of restrictions on coefficients (Burguete, Gallant, and Souza 1982).

It appears that those who choose public-interest law are unusual,
and not only in terms of the unmeasured determinants of earnings.
This can be seen by calculating the estimated probability of choosing
public-interest law conditional on the mean public-interest values of
$x$, $z$, and $\epsilon$. 15 That probability is .065. This is substantially higher than
the probability calculated without taking account of the $\epsilon$'s but
indicates that public-interest law is still an unusual choice, even after they
are accounted for. The results are thus not inconsistent with the
hypothesis that there are substantial differences in preferences for
public-interest relative to private law, beyond those captured in the
measured variables used here.

15 This is done by evaluating the term in braces in eq. (A2) in the Appendix at the
estimated parameters and the mean public-interest values of $y$, $x$, and $z$.


B. Robustness 

In assessing empirical results it is natural to ask whether special
assumptions involved in obtaining them may be tested for plausibility
and whether the results are robust to weakening such assumptions. In
this subsection, I examine some of the assumptions invoked here,
particularly as they relate to the failure to find significant negative
earnings differentials in the public-interest sector.

One obvious generalization of the basic model would allow the
effects of the $x$ variables on earnings to vary by sector; that is, it would
include a separate $\beta$ vector for each sector rather than merely an
intercept shift. In the implied generalization of (4), $x$ would appear in
the choice of sector function, with constraints on its coefficient vector.
Such a model was estimated by WESML, and the restrictions on it
implied by the basic model could not be rejected (Wald statistic value
= 1.58; .90 critical value = 7.78). This model also provided no
evidence of negative earnings differentials in the public-interest sector.
The difference in the systematic parts of the earnings functions
[$(\beta_1 - \beta_0)'x$] evaluated at the mean public-interest characteristics was
close to zero (.010), albeit with a very large standard error (.237).

Two other "general" specifications were explored. Both involved
sector-specific $\beta$ vectors. In addition, both entered $x$ in the choice of
sector function without constraints on its coefficient vector. Doing so
allows for the possibility that the characteristics that influence
potential earnings are also associated with differences in nonpecuniary
preferences, that is, that $x$ is included in $z$. The second of these models also
treated the disturbance in the choice function in a different way.
Rather than including the difference in the $\epsilon$'s directly in this
function, I simply assumed that the disturbances in the earnings equations
were correlated with that in the choice function, and two correlation
coefficients were estimated. This may be interpreted as relaxing the
assumption that the individual knows exactly the difference in
potential earnings at the time the choice is made.

As might be anticipated, standard errors for the parameter
estimates tended to be larger in these less restrictive models. The
principal results were, however, largely unaffected. The basic model is a
special case of the first of these general ones, 16 and the constraints
that it imposes could not be rejected (Wald statistic value = 9.30; .90
critical value = 13.36). The hypothesis that $\beta_1 = \beta_0$ could not be
rejected in either model, and neither provided any evidence of
negative earnings differentials in the public-interest sector.

All the WESML estimators discussed here rely on the assumption
of normal disturbances. It has been shown in related contexts that the
properties of estimators can be sensitive to departures from normality
(Goldberger 1980; Arabmazar and Schmidt 1982; Paarsch 1984), so
the suitability of this assumption may be a concern. To get some
evidence on this point, $\chi^2$ goodness-of-fit tests (Heckman 1984) were
performed on the estimated earnings equations. These tests compare
the actual distribution of earnings in the sample with that predicted
by the model. The test results did not reject normality in the basic
model or in the model augmented to allow sector-specific $\beta$'s. 17

A final set of estimates of some interest are those reported in
column 3 of table 2. The estimation techniques employed for the basic
model and for its more general variants deal simultaneously with (a)
the self-selection issue and (b) the choice-based nature of the sample.
Most of the discussion has emphasized self-selection, which can be
regarded as the more central issue. Were it not a problem, the
techniques used in column 1 of table 2 would be acceptable for estimation
of the earnings function, despite the nature of the sample. Still, it is
interesting to explore the importance of dealing appropriately with a
choice-based sample, given that self-selection exists. The column 3
results enable us to do so. These were estimated exactly as in the basic
model except that all observations were weighted equally in the
log-likelihood function. Thus the full sample was treated as though it
were randomly drawn from the population of lawyers. The standard
errors reported would be appropriate if the sample were randomly
drawn.

16 It is not a special case of the second model because that model imposed a constant
variance on the disturbance in the choice function.

17 The test statistic calculated for the full sample and the basic model is 8.28, and that
for the augmented model is 8.24. The relevant .90 critical value with seven degrees of
freedom is 12.02.

These results suggest that accounting properly for choice-based
sampling is indeed important. The column 3 earnings function more
nearly resembles that of column 1 than column 2, with the estimate of
$\delta$ negative and rather large. There are, however, some differences
from column 1: $\delta$ is less than two-thirds as large in absolute value, and
the value of $\beta'x$ at the mean private lawyer characteristics is, at 3.24,
well below the sample mean of log earnings for private lawyers of
3.34. According to these estimates, lawyers who choose the private
sector tend to be individuals whose potential earnings there are
relatively high, given their measured characteristics, and therefore their
earnings overstate what a randomly chosen lawyer could expect. But
the importance of this self-selection into the private sector is greatly
exaggerated since, by failing to account for the oversampling of
public-interest lawyers, these estimates implicitly treat the
public-interest choice as far more common than it really is. The coefficients
on the $z$ variables are of the same sign and order of magnitude as
those in the estimates of the basic model, but the effects of these
variables on the probability of choosing public-interest law are
estimated to be much larger in column 3. The reason is again that this
approach incorrectly treats public-interest law as a relatively common
choice.

C. Discussion 

At a minimum, the results presented here indicate that taking proper 
account of self-selection and of the process by which the sample is 
generated can make an important difference in estimating models of 
this type. The fact that the focus is on a set of individuals who make a 
very unusual choice makes dealing with selection issues all the more 
crucial. Those lawyers who choose public-interest law must be outliers 
in some respect; the possibility that they are outliers in unmeasured 
determinants of earnings should not be assumed away. Nonetheless, 
some limitations of the adopted approach should be acknowledged. 

The general model of sector choice in equations (1)-(3) is
admittedly somewhat simplistic. It assumes that sector choice is based only
on current income and nonpecuniary factors, thus ruling out, for
example, the possibility that job decisions by lawyers may include an
investment component: sacrificing current for future income. The
assumption of a sharp dichotomy between two sectors, with all jobs
within each sector identical in terms of nonpecuniary characteristics,
may also be questioned. It seems likely that some jobs in the private
sector are more similar to public-interest jobs than others (in working
conditions, types of clients served, and so forth), so that the choice is
over a more nearly continuous range of alternatives rather than a
dichotomous one. 18

The set of exogenous variables used is relatively small, creating the 

18 It should be obvious that the model also says nothing about the nature of the 
differences in nonpecuniary aspects of the different types of jobs. The differences need 
not be only in the nature of the work done but might also include differences in 
working conditions, in work hours (given that they have not been measured), in level of
effort required, or in other things. There is some discussion of this point in Weisbrod
(1983).
usual concerns that correlations between important omitted factors
and included variables may bias the estimated parameters away from 
their theoretical counterparts. As always, the range of possible choices 
for exogenous variables is limited by the information available in the 
data set. But in addition, calculation of the WESML estimators (and 
their asymptotic covariance matrices) in these models is computation¬ 
ally difficult, and the difficulty (computer time required) rises rapidly 
with the number of parameters to be estimated. Very little analysis of 
the sensitivity of results to changes in specification beyond that re¬ 
ported here has been performed. 

Finally, behavior on the employer side of the market has not been 
modeled explicitly. As already noted, however, the results suggest 
that employers in the public-interest sector reward the measured x 
characteristics in the same way that firms in the private sector do but 
that more difficult to measure characteristics that influence earnings 
in the private sector appear to matter less in public-interest law. Such 
a wage policy is consistent with the finding that, other things equal, 
this sector attracts lawyers whose potential earnings in the private 
sector are low compared with others with the same x’s. Why these 
employers should behave in this manner is not explained here. One 
possibility is that the two types of work are truly different, and mar¬ 
ginal productivity simply varies less across individuals in the type of 
work that public-interest lawyers do than it does in private law. In that 
case the resulting wage distributions could be consistent with competi¬ 
tive equilibrium, as Roy's model implies. 

Another hypothesis is that individual differences influence produc¬ 
tivity equally in both sectors but that public-interest employers devote 
less effort to determining the true productivity of employees than 
private firms do. A private law firm behaving in this way would pre¬ 
sumably not survive long in a competitive market for legal services. 
But public-interest law firms do not face the same kinds of market 
constraints that private firms do, and it is not inconsistent with some 
views of government and nonprofit behavior to expect that they 
would devote less effort to minimizing the cost of output. 

III. Conclusion

The role of nonpecuniary compensation in the sorting of workers 
among jobs is an important issue that is only beginning to receive 
empirical study. Systematic sorting related to differences in preferences
may have implications for, among other things, differences in
behavior among different types of institutions (e.g., private firms, 
government, and nonprofit organizations). This paper presents and 
estimates a model of choice between two types of jobs, in which the 
choice is assumed to depend on the difference in potential earnings
and on personal characteristics related to preferences for the non- 
pecuniary aspects of the jobs. Potential earnings in the two sectors are 
estimated simultaneously in a way that accounts for self-selection. 

The application involves lawyers employed in the private and 
public-interest sectors. Since the public-interest lawyers are over- 
sampled, an estimation technique appropriate for a choice-based sam¬ 
ple is used (an adaptation of the Manski-Lerman WESML estimator). 
Contrary to the impression one gets from estimating an earnings 
function for lawyers without accounting for self-selection, the results 
provide no evidence that those who choose public-interest law are 
generally accepting large sacrifices in earnings by virtue of that 
choice. Personal characteristics are found to be related to the choice 
of sector, suggesting that sorting related to differences in preferences 
does exist, but the characteristics used in this application do not go far 
toward explaining which lawyers choose public-interest law. The re¬ 
sults are subject to a number of qualifications, but they point up the 
potential importance of selection bias, particularly when one is con¬ 
sidering a set of individuals who make a very unusual choice and 
comparing them with a larger population. 


Appendix 

This Appendix describes the WESML estimator and its asymptotic covariance
matrix for the basic model. For an individual $i$ we observe $x_i$, $z_i$, the choice of
sector, and $y_i$ in the sector chosen. Conditional on $x_i$ and $z_i$, with random
sampling assumed, the likelihood of a particular observation ($j = 0$, $y_{0i} = y_i$)
for an individual in sector 0 is the likelihood of the joint event

$$\{e_{0i} = y_i - \beta'x_i,\; -\theta_i \ge \gamma\delta + (\alpha_1 - \alpha_0)'z_i\},$$

where $\theta_i = \gamma(e_{1i} - e_{0i}) + \phi_i$. This is equal to

$$f(e_{0i} = y_i - \beta'x_i) \cdot \operatorname{prob}\{-\theta_i \ge \gamma\delta + (\alpha_1 - \alpha_0)'z_i \mid e_{0i} = y_i - \beta'x_i\},$$

where $f(\cdot)$ is the marginal density function for $e_0$.

Under the assumption of normality of the disturbances and random sampling,
this leads to a term in the log-likelihood function of the form

$$-\log\sigma_0(i) + \log g\!\left[\frac{y_i - \beta'x_i}{\sigma_0(i)}\right] + \log\left\{1 - G\!\left[\frac{\gamma\delta + (\alpha_1 - \alpha_0)'z_i + \gamma[\rho\sigma_1(i)/\sigma_0(i) - 1](y_i - \beta'x_i)}{\sqrt{1 + \gamma^2(1 - \rho^2)\sigma_1^2(i)}}\right]\right\} \tag{A1}$$

for each individual $i$ in sector 0, where $g(\cdot)$ and $G(\cdot)$ are the standard normal
probability density function and cumulative distribution function, respectively.
This formulation embodies the normalization that $\sigma_\phi = 1$ and incorporates
the heteroscedasticity of the $e$'s assumed in Section IC. That is, $\sigma_0(i)$
denotes the standard deviation of $e_0$ conditional on person $i$'s experience, or
$\sigma_0(i) = \sigma_{01} + (\mathrm{EXPR}_i)^{w}\sigma_{02}$.

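The analogous term to (A1) for individual $k$ in sector 1 is

$$-\log\sigma_1(k) + \log g\!\left[\frac{y_k - \beta'x_k - \delta}{\sigma_1(k)}\right] + \log G\!\left[\frac{\gamma\delta + (\alpha_1 - \alpha_0)'z_k + \gamma[1 - \rho\sigma_0(k)/\sigma_1(k)](y_k - \beta'x_k - \delta)}{\sqrt{1 + \gamma^2(1 - \rho^2)\sigma_0^2(k)}}\right]. \tag{A2}$$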
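As a rough illustration, the two terms can be evaluated numerically. The sketch below is only one reading of (A1) and (A2): it assumes that each term is the log of a normal density for the observed earnings residual plus the log of a conditional normal probability for the sector choice, with the variance of the preference disturbance normalized to one. The function and argument names (`loglik_sector0`, `xb`, `za`, and so on) are hypothetical conveniences and are not notation from the paper.

```python
from math import log, sqrt
from statistics import NormalDist

# Standard normal distribution: its pdf plays the role of g, its cdf the role of G.
_STD_NORMAL = NormalDist()


def loglik_sector0(y, xb, za, gamma, delta, rho, sigma0, sigma1):
    """Assumed form of term (A1) for one person observed in sector 0.

    xb stands for beta'x_i and za for (alpha_1 - alpha_0)'z_i;
    sigma0 and sigma1 are that person's standard deviations.
    """
    resid = y - xb  # implied value of the sector-0 disturbance
    index = gamma * delta + za + gamma * (rho * sigma1 / sigma0 - 1.0) * resid
    scale = sqrt(1.0 + gamma ** 2 * (1.0 - rho ** 2) * sigma1 ** 2)
    return (-log(sigma0)
            + log(_STD_NORMAL.pdf(resid / sigma0))
            + log(1.0 - _STD_NORMAL.cdf(index / scale)))


def loglik_sector1(y, xb, za, gamma, delta, rho, sigma0, sigma1):
    """Assumed form of term (A2) for one person observed in sector 1."""
    resid = y - xb - delta  # implied value of the sector-1 disturbance
    index = gamma * delta + za + gamma * (1.0 - rho * sigma0 / sigma1) * resid
    scale = sqrt(1.0 + gamma ** 2 * (1.0 - rho ** 2) * sigma0 ** 2)
    return (-log(sigma1)
            + log(_STD_NORMAL.pdf(resid / sigma1))
            + log(_STD_NORMAL.cdf(index / scale)))
```

When `gamma` is zero, earnings play no role in the choice and each term collapses to a normal log density plus the log of a simple probit probability, which gives a convenient check on the code.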


With adaptations of the Manski-Lerman results, the WESML estimator 
appropriate for the choice-based sampling case is obtained by weighting each 
term in the log-likelihood function according to the sector from which the 
observation comes. The weight w(j), applied to each observation from sector 
j, is the ratio of the population proportion of individuals in that sector to the 
proportion in the sample. 
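The weighting step can be sketched in a few lines. The example below is a minimal illustration and not the estimation code used for the paper: the function names and the sector shares in the usage lines are made up for illustration, and the per-observation log-likelihood terms are taken as given.

```python
from collections import Counter


def wesml_weights(sample_sectors, population_shares):
    """Return w(j) = population share of sector j / sample share of sector j."""
    n = len(sample_sectors)
    counts = Counter(sample_sectors)
    return {j: population_shares[j] / (counts[j] / n) for j in counts}


def weighted_loglik(sample_sectors, loglik_terms, population_shares):
    """Sum the log-likelihood terms, each weighted by its sector's w(j)."""
    weights = wesml_weights(sample_sectors, population_shares)
    return sum(weights[j] * term
               for j, term in zip(sample_sectors, loglik_terms))


# Hypothetical numbers: sector 0 is 2 percent of the population
# but half of the sample, so its observations are weighted down.
sectors = [0, 0, 1, 1]
shares = {0: 0.02, 1: 0.98}
weights = wesml_weights(sectors, shares)
```

With these made-up shares the oversampled sector receives a weight well below one and the undersampled sector a weight above one, which is the sense in which the weighting undoes the choice-based sampling.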

Following Manski and Lerman, the asymptotic covariance matrix of the 
WESML estimator can be consistently estimated as follows. Let $h$ denote the
vector of WESML estimates, $W$ the weighted log-likelihood function at sample
size $N$, and $W_i$ the term in that function for observation $i$. Then the
asymptotic covariance matrix of $h$ is consistently estimated by
Pickering's Collected Malthus:
A Review Article 


Peter Groenewegen 

University of Sydney 


The publication date (1986) of this first collected edition of the works 
of Thomas Robert Malthus (1766-1834) coincides with the ses- 
quicentenary of the second, posthumous edition of his Principles of 
Political Economy by William Pickering of London, the forerunner of 
the publishers of this collection. Its eight volumes 1 are said to comprise
all of Malthus’s known published writings. This collection is also
the first in a series of Pickering Masters that promises “texts reset to 
modern standards, English translations where necessary, scholarly 
introductions and textual notes, and a general index for each author.” 

The Pickering venture virtually coincides with Malthus publishing 
initiatives from the Royal Economic Society and Cambridge University
Press. In 1926 the Royal Economic Society and Macmillan reprinted
the first edition of Malthus’s First Essay on the Principle of
Population, with notes by James Bonar. 2 More than half a century later 
it is publishing a variorum edition of subsequent versions of the Essay 
that appeared between 1803 and 1826 in tandem with a variorum 
edition of Malthus’s Principles of Political Economy. These editions have 


The preparation and revision of this review article have benefited from John Pullen's 
generous assistance and suggestions from George Stigler, here gratefully acknowledged.
This review is respectfully dedicated to the memory of Patricia James, the noted
Malthus scholar, who died suddenly, but peacefully, on March 15, 1987. I was informed
of her death by her son after writing to her for assistance with a query, which
unfortunately she herself will now no longer be able to answer. 

1 The eight volumes include: 1, An Essay on the Principle of Population (1798); 2 and 3,
An Essay on the Principle of Population (1826); 4, Essays on Population; 5 and 6, Principles of
Political Economy (1836); 7, Essays on Political Economy; and 8, Definitions in Political 
Economy and a general index. 

2 Bonar (1926, pp. i-ii) reported that the first edition was a rarity already in 1895
when William Ashley edited parallel chapters from the first and second editions. Details 
of other reprints of the first edition are given in vol. 1, p. 52. 

[Journal of Political Economy, 1988, vol. 96, no. 2]
been prepared, respectively, by noted Malthus scholars Patricia James
and John Pullen. Whether this Malthus publishing double is a fortu¬ 
nate coincidence seems doubtful. However, it provides an almost 
unique opportunity for consumer choice to economists wishing to 
have a representative sample of Malthus’s major work on their book¬ 
shelves. 

The value of such increased accessibility of Malthus’s works for
contemporary economists perhaps needs some demonstration. Al¬ 
though for close to a century after his death Malthus was ignored as 
an economist for all practical purposes, publication of John Maynard 
Keynes’s General Theory in 1936 sparked considerable revival of inter¬ 
est in his work. In the centenary allocution written after his revised 
1933 Memoir of Malthus, Keynes claimed that Malthus had “a pro¬ 
found economic intuition and an unusual combination of keeping an 
open mind to the shifting period of experience... [and that] Malthus, 
above all, was the great pioneer of the application of a frame of 
formal thinking to the complex confusion of the world of daily 
events” (Keynes 1972, pp. 107-8). Thirty years later, the occasion of 
the bicentenary of Malthus’s birth allowed Robbins (1967, p. 260) to 
remind members of the Royal Economic Society “that Malthus is one 
of the most illustrious of our predecessors.” Reasons included his 
“instinct against rigidity” in thought, his love of truth, and his pro¬ 
found humanity. Such qualities are rare and valuable in economists. 
Hence, irrespective of the degree of acceptance these appreciations of 
Malthus induce, an opportunity to study such qualities at first hand 
provides one strong justification for making Malthus's writings more 
readily available. Historians of thought will find justification enough 
from many new interpretations of Malthus's work that have appeared 
over the last decade and the evidence of a new Malthus revival such as 
the 1984 Paris “Malthus Past and Present” conference at which no 
fewer than 164 papers were presented. 

Readers of this Journal will want to know what benefit the modern 
economist can derive from studying the collected writings of Malthus. 
As the views from Keynes and Robbins already cited suggest, such 
benefits do not come from specific aspects of his doctrine but from the 
perspectives he brought to the study of political economy. In particu¬ 
lar, those seeking explanations for economic events in the modern 
world will gain from closer acquaintance with what has been de¬ 
scribed as Malthus’s “wondrous gift" of “intuition to bring to an ex¬ 
plicit level deep problems of economic life" (Stigler [1953] 1965, 
p. 311). A full appreciation of this major talent requires careful study 
of its application by Malthus, something clearly facilitated by the pub¬ 
lication of these collected works. Some references to samples from 
these collected riches may, therefore, at least illustrate this enduring 
quality in Malthus’s writings to which Keynes, Robbins, and Stigler
have pointed. Since learning from mistakes is also a useful quality for
the practicing economist, Malthus’s “great weakness,” or his inability
to “reason well” and to construct theories fully consistent within themselves
and with the facts of the world (Stigler 1965, p. 311), may be equally
instructive.

As a first sample, economists can be directed to Malthus’s introduc¬ 
tion to his Principles of Political Economy (5:5-18) with its useful em¬ 
phasis on the limitations of the science. Malthus’s warning about the 
attraction of simplification and generalization and the dangers of sin¬ 
gle-cause explanations, in combination with his reference to Isaac 
Newton’s “admirable rule” of not admitting more causes than are 
strictly necessary to solving the problem at hand, is a good example of 
its methodological wisdom. Equally useful is the associated “strong 
conviction . . . that the frequent combination of complicated causes, 
the action and reaction of cause and effect on each other, and the 
necessity of limitations and exceptions in a considerable number of 
important propositions, form the main difficulties in the science, and 
occasion those frequent mistakes which it must be allowed are made 
in the prediction of results” (pp. 8-9). Furthermore, Malthus's appeal 
for a need to bring theories “to the test of experience” and the 
difficulties inherent in such a test (pp. 10-11) can still bear repeating 
in economics as well as the reason for this need that he saw in the 
essentially applied and practical nature of the subject. The supporting 
illustrations he gave, drawn from his discussion of limitations on the 
duties of the state, contain advice the validity of which stands as firmly 
today as it did in 1819 when first written (p. 15). This introduction 
concisely reflects the qualities Robbins praised in Malthus: his nonrigid
thinking, his love of truth, and his profound humanity.

A sample of Malthus’s capacity for perceptive economic analysis 
can be found in his discussion of the consequences of an increased 
paper money supply for prices, activity levels, and the distribution of 
income, largely intended to illuminate the extent and the manner in 
which “an increase of currency tends to increase capital” (7:46-50; cf. 
pp. 74-75). This discussion arose from a need to reconcile the practi¬ 
cal views of merchants and manufacturers on their perceived ability 
to increase their productive capital through a loan of paper money 
with the opinion expressed in the literature that such transactions can 
in no way increase the capital of the country. Malthus’s explanation, 
then still relatively novel, involved forced saving via the consequences 
of the distribution of the additional circulating medium on prices and 
profits in the short run and increased potential for accumulation 
from the output effects of this increased capital over the longer run. 
Although his view differs from Richard Cantillon's and David
Hume’s view on the stimulus to activity from an increased money supply
(Malthus assumes fully employed resources), he nevertheless used it
to settle a conflict of opinion between Hume and Adam Smith on the 
matter in Hume’s favor on the basis of an episode of Scottish eco¬ 
nomic history of the 1750s, taking these data also as confirmation of 
his theory (p. 49). This monetary discussion, inspired by the bullion 
controversy, also supports the well-known note in the Principles 
(6:260, n. 10) on the importance of money to the analysis of economic 
growth because of its effects on the distribution of wealth and the 
encouragement of economic activity. This seems to be one aspect of 
Malthus’s work that clearly resembles that of Keynes because it places 
much emphasis on the importance of analyzing a monetary economy 
and avoiding the dangers of seeing money as a veil. 

Malthus's use of empirical data, for which he has also often been 
praised, is perhaps best illustrated by reference to his ultimate piece 
on the population theory that was contributed in 1824 to the supplements
of the Encyclopaedia Britannica. In this work, by careful use of
the available U.S. demographic statistics (including making the neces¬ 
sary corrections for immigration), he was able to show the possibility 
of a country’s actually doubling its population every 25 years for a 
considerable period of time. This, in combination with evidence obtained
from the early nineteenth-century censuses of England and Wales,
allowed Malthus to derive a general result. From these data he felt it
safe to assert “that population, when unchecked, increases in a
geometrical proportion of such a nature as to double itself every 25
years,” an indication for him that it was clearly possible for mankind
to increase at this specific rate (4:184-93). However, the article’s em¬ 
phasis on demographic data to substantiate the famous geometrical 
progression of increases in population can be contrasted with the 
complete absence of empirical evidence to substantiate the prediction 
that agricultural productivity could not possibly rise eightfold over 
the next two centuries (p. 194). This perhaps reflects his capacity for 
inconsistency noted by Stigler. After all, this unsubstantiated predic¬ 
tion about agricultural production in the work for which he always 
was largely noted sits uneasily with his remarks on the dangers in 
making economic predictions quoted earlier from the introduction to 
his Principles. 
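The arithmetic behind this contrast is easy to check. The short sketch below simply converts the two figures quoted above, a doubling of population every 25 years and an eightfold rise in output over two centuries, into comparable growth rates. The function names are illustrative, and nothing here is drawn from Malthus's own calculations.

```python
from math import log


def annual_rate_from_doubling(years_to_double):
    """Constant annual growth rate implied by a given doubling time."""
    return 2.0 ** (1.0 / years_to_double) - 1.0


def annual_rate_from_multiple(multiple, years):
    """Constant annual growth rate implied by growing `multiple`-fold in `years`."""
    return multiple ** (1.0 / years) - 1.0


def doubling_time(annual_rate):
    """Years needed to double at a constant annual growth rate."""
    return log(2.0) / log(1.0 + annual_rate)


population_rate = annual_rate_from_doubling(25)      # roughly 2.8 percent a year
output_rate = annual_rate_from_multiple(8, 200)      # roughly 1.0 percent a year
population_multiple = 2.0 ** (200 / 25)              # eight doublings in two centuries
output_doubling_years = doubling_time(output_rate)   # three doublings in two centuries
```

On these figures, unchecked population would multiply 256-fold over two centuries, while the output ceiling Malthus asserted corresponds to a doubling roughly every 67 years.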

The foregoing is only an indication of the variety and interest of the 
Malthus material gathered in these collected works. Such examples can easily
be multiplied. For example, a quite different Malthus is presented in
his defense of the East India College, where he was professor of 
political economy. The defense has a clarity of purpose and style that 
some present university administrators would envy. However, it is 
clearly impossible to review all facets of Malthus’s work presented in
the eight volumes of his collected writings. 
What strategy should the reviewer of the collected works of a classical
economic writer then adopt? Taking a leaf from Stigler’s review of
Sraffa’s Ricardo (Stigler 1965, esp. pp. 302-3), I will proceed as follows.
I will first evaluate the quality of the edition. Next, I will review
some recent work on Malthus’s theory of effectual demand and ac¬ 
cumulation in the context of the material presented in this edition. 
This is designed to highlight the edition’s usefulness for settling unre¬ 
solved questions about Malthus’s work, a matter of interest to the 
economist as well as to the historian of economic thought. 

I. Quality of the Edition 

After the publication of the Sraffa edition of the works and corre¬ 
spondence of David Ricardo (1951-73), reviewers have established 
nearly an absolute standard of excellence for measuring the quality of 
similar ventures. That edition set virtually unbeatable records for 
comprehensiveness, 3 degree of accuracy in transcriptions and prepa¬ 
ration of variorum texts, and scholarship in editorial notes and in¬ 
troductions. Last, but by no means least, its contents include a superb 
general index, though this did not appear until more than two de¬ 
cades after the publication of the initial volumes of text. In the words 
of one of its reviewers (Stigler 1965, pp. 303-4), Piero Sraffa’s work 
exhibited “extraordinary . . . accuracy,’’ “superb . . . editorial notes,” 
and “superlative quality of scholarship,” qualities in the light of subse¬ 
quent experience difficult to emulate, let alone surpass. Therefore, 
the Sraffa edition of Ricardo acts as an absolute standard of quality in 
evaluating like ventures in terms of their departures from that ideal. 

Some specific features of the edition may be mentioned first. The 
texts have been reset to modern standards and the eight volumes 
aesthetically bound in uniform style. As detailed later, this resetting 
has imposed costs. The edition allows identification of the pagination 
of original editions actually reprinted while continuous pagination for 
each volume (at the bottom of the page) meets the needs of the gen¬ 
eral index. However, pagination of editions not reprinted, like the 
first edition of the Principles, cannot be identified. Further, English 
translations from foreign-language quotations are not invariably pro¬ 
vided (e.g., 3:505; 5:25, 112; 6:270-75; 7:117; 8:115-16). This is one 
of a number of cases of not matching promise with performance. The 


3 Both the comprehensiveness and accuracy of Sraffa's Ricardo can be assessed by the
small number of corrections and omissions the editor subsequently had to report
(Ricardo 1951-73, 10:411; 11:ix-xxxii). Another omission from Sraffa's Ricardo, and
one associated with the subject matter of this article, is Ricardo’s commentary on
Malthus's Measure of Value, published in Porta (1979). Its existence is not mentioned as
relevant information in the introduction to that essay on political economy by Malthus
in this edition (7:10).
general index (8:125-65) is very comprehensive and surpasses in
quality the list of sources Malthus used, another feature of this edi¬ 
tion. 

Completeness has to be judged also. The edition's intended cover¬ 
age (1:7, 12-13) was explicitly confined to “the published writings of 
Thomas Robert Malthus,” thereby excluding correspondence
and travel diaries. Although much of the important Malthus-Ricardo
correspondence is already in print (Ricardo 1951-73, vols. 6-9, 11) as
are the more important travel diaries (James 1966), there is much 
other correspondence extant, at least some of which (e.g., that with 
Francis Horner, Thomas Chalmers, and Henry Parnell) is of interest 
to economists and has been published in forms of varying accessibility 
(see, e.g., Zinke 1942; McCleary 1953; de Marchi and Sturges 1973). 
In addition, unpublished manuscripts, travel diaries, personal papers, 
and family and other correspondence, including the substantial cor¬ 
respondence with his father Daniel (long believed to have been lost), 
have recently been rediscovered. 4

There are omissions other than the manuscript material just mentioned.
One example is the evidence Malthus gave to select parliamentary
committees in the 1820s, the two occasions 5 when he did so
both being of interest to economists. A second regrettable omission is 
the Inverarity manuscript with its series of questions on Adam Smith 
Malthus put to his students at Haileybury College. This omission is all
the more regrettable because it is one of the more detailed sources for 
Malthus’s views on the Physiocrats (Pullen 1981; cf. Bonar 1924, pp. 
427-28). A final class of omission covers published articles attributed 
to Malthus. Although I will ignore the list of articles in the British 
Critic Rashid (1982, p. 25) attributes to Malthus on slender evidence, 
two articles in the Edinburgh Review of 1808 and 1810 for which at¬ 
tribution to Malthus is firmer need to be mentioned. This edition (vol. 
4, essays 2, 3, and 6; vol. 7, essays 2 and 3) reprints the five articles 
from the Edinburgh Review attributed to him on which there is no 


4 In 1986, Maggs Brothers, Ltd., booksellers of Berkeley Square, London, offered
for sale what they describe as the “Malthus Archive,” consisting of the remaining
manuscripts and correspondence of Malthus and his family discovered in the home of
the late Robert Malthus in the Isle of Wight. The catalog lists a substantial number of
letters, as well as draft manuscripts including several on economic topics. These include 
an early essay on colonies (circa 1800), notes on taxation, and a variety of papers on 
monetary matters. This information was kindly supplied by John Pullen. 

5 The first was the evidence to the Committee on Artizans, Machinery and Combinations
given by Malthus on May 10, 1824 (see Gordon 1979, p. 30; James 1979, p. 391),
which among other things puts Malthus's economic views on combinations in a more
liberal perspective. His May 1827 evidence to the Committee on Emigration could have
been usefully included with his essays on population (vol. 4), to which much of it is
relevant. It also contains interesting changes in Malthus's policy views on public works
(Bonar 1924, pp. 195, 240-41; James 1979, pp. 391-96).
dispute, but the exclusion of the two others (see Semmel 1963, pp.
14-16) is at least debatable. 6 

Other aspects of the quality of the edition need examination. Take 
first E. A. Wrigley’s general introduction (1:7-39). It starts with a 
brief statement of editorial policy and the “limited” objectives of the 
edition, continues with a sketch of Malthus’s life, and concludes with 
an economic historian’s evaluative sketch of his work, emphasizing 
that on population. It provides some useful insights. Highlighting 
Malthus’s struggle between “clarity" and “comprehensiveness” as well 
as the concluding statement relating Malthus’s determination “to 
ground speculation in an empirical framework” to a requirement “to 
read and to judge” his work “in the context of his times” (p. 39) seems 
useful information for those about to embark on a Malthus reading 
marathon. However, the “interpretative” rather than “factual” nature 
of the introduction renders it liable to the more rapid obsolescence to 
which views on the importance of a particular social scientist are so 
often prone, a danger enhanced when the person whose views are 
investigated is a controversial figure like Malthus. Such “factual” ne¬ 
glect also reduces the potential value to its readers. 7 

Introductions to individual volumes leave much to be desired. One 
example will suffice. David Souden’s introduction (7:7-11) to the 10 
essays on political economy is an outstanding example of editorial 
thrift and is confined to only four and a half pages. Compared with 
the information M. H. Dobb and Sraffa provide on Ricardo’s bullion 
papers (and useful as a starting point for those interested in Malthus's 


6 The essays in question are a review of John Ingram's Disquisitions on Population and
William Hazlitt's Letters in Reply to Malthus (Edinburgh Review, August 1810) and one of
William Spence on commerce (January 1808). Bonar (1924, pp. 33 n. 2, 329 n. 3)
suggests Malthus as the author for the former, though he admits that Francis Jeffrey
probably provided his customary “head and tail” to disguise authorship. Fetter (1953,
p. 247) declines to make such a positive attribution because the essay speaks too well of
Malthus’s own work on population, and subsequent scholarship supports Fetter rather
than Bonar (Semmel 1963; James 1979; Pullen 1987b). The article on Spence on
commerce is more firmly and widely attributed to Malthus (Fetter 1953; Henderson
1984; Pullen 1987b), but not by Semmel (1963). Fetter relies on Horner's correspondence
with Jeffrey (quoted by James 1979, pp. 149-50); Pullen (1987b) adds
internal evidence and, more tellingly, Henry Brougham's direct attribution to Malthus in
his list of contributors to the Review now at University College, London. Referencing in
this essay and Malthus’s known work further supports an attribution, particularly the
reference to Berkeley's “wall of brass” (Edinburgh Review, January 1808, p. 447), used in
Malthus’s Essay on Population (3:403) and derived from Berkeley’s Querist (query 134
in Johnston [1970, p. 136]), of which Malthus had a copy in his library (Harrison 1983,
p. 14).

7 Perhaps a wise decision given the factual weaknesses in the introduction, commencing
with the bibliographical note (p. 7), which contains a number of errors. Likewise the
factual material on Malthus the demographer and economist is scanty. For example,
little attempt has been made to flesh out the gain in Malthus's demographic knowledge
between 1798 and 1802, to which reference is made (1:22; 3:7-9).
essays 2 and 3 in this volume), Souden’s introduction is a veritable
desert of noninformation. For example, no details are given on 
Malthus’s inspiration for the essays in question, the background to 
their immediate occasion, nor, for that matter, the number of print¬ 
ings or editions they enjoyed, the reception they received, and, where 
relevant, the version from which the present reprint was made. 8 On 
this score, Pickering’s edition has very little to offer. 

The usefulness and consistency of the editorial notes are on a par 
with the quality of the introductions. This arises in part from the 
edition’s dubious practice of altering Malthus’s system of referencing 
“into a modern form ... though this entailed altering and extending 
the original text of the footnote” (but without making it possible for 
the reader to see how the original reference was made apart from 
consulting the original text; see 1:9-10). Apart from possible wrong 
identification of such references inherent in this practice, the system 
is far from consistently applied. 9 Likewise, cross-references to the
pagination of the current editions are provided without any real con¬ 
sistency (e.g., cf. 7:198 n. 12 and n. a thereto with pp. 191 n. 9, 210 n. 
19, 212 n. 20, and 213 n. 20). With a collected Malthus not likely to be 
repeated, this editorial weakness seems an important “lost opportu¬ 
nity," though the time and effort involved to remedy these shortcom¬ 
ings would have been fairly substantial. 

More serious is the absence of editorial notes (so liberally included 
in Sraffa’s Ricardo) providing real background to the reader, an omis¬ 
sion all the more strange given the argument of the general introduc¬ 
tion that Malthus’s views should be judged within their context. Ab¬ 
sence of such notes presumes also an enormous general knowledge 
on the part of the reader with respect to early nineteenth-century 
current affairs and classical Greek and Roman literature. Even when 
compared with Jacob Hollander’s 1903 reprint of Malthus’s Inquiry 
into the Nature and Progress of Rent, Pickering’s version scores badly. Its 
two editorial notes (7:127 n. a, 142 n. b) only correct minor misprints 
in the original (neither of which was corrected by Hollander), but this 


8 An example is Malthus's Observations on the Effects of the Corn Laws, which, as shown
by James (1979, p. 253), was reprinted immediately with minor alterations and went 
through one further revision. This added three pages to the original 44, including 
some notes (p. 256). No indication of any of this is presented to the reader of this 
collection. John Pullen has reminded me that the second, posthumous edition of the 
Definitions ought also to be noted in this context. 

9 An example of wrong identification is the reference to John C. Curwen’s plan
(3:551), which confuses his 1808 pamphlet with speeches in Parliament in which he
outlined the plan (May 28, 1816, and January 21, 1817). An example of inconsistent
practice is given by no fewer than seven notes in the reprint of The Measure of Value
(7:180 n. 3; 182 n. 4; 186 n. 7; 189 n. 8; 215 n. 23), which fail to follow modern usage by
not providing page references to the works cited.
version ignores the misprints identified by Hollander, including a
wrong page number in a reference, and it makes no attempt to match 
the information in Hollander's 15 additional editorial notes. 10 

Judgment about the variorum edition is best left till the Royal Eco¬ 
nomic Society Malthus ventures are published. It may be noted, how¬ 
ever, that Pickering's edition invariably compares only two versions of 
the works it gives variorum treatment. For the Essay on Population this 
means that changes in the third (1806), fourth (1807), and fifth (1817) 
editions cannot be tracked, and it is therefore impossible to system¬ 
atically trace the evolution of Malthus’s thoughts on population 
through the three decades in which they developed. 

Although a variorum edition is perhaps of little use to most econo¬ 
mists, this omission in the variorum treatment creates problems in 
trying to answer questions in which there may be greater interest. 
Pickering’s Malthus, for example, easily allows an assessment of the 
use Malthus made in the sixth edition of the 1801, 1811, and 1821 
population data made available by the census established by act of 
Parliament in 1800. However, it cannot indicate the extent to which 
he was induced by the existence of these data to alter his theory in 
particular respects. Generally speaking, as is in any case well known, 
he took these new data as confirmation of his principle of population 
because they so clearly showed the potential for population growth in 
a rapidly developing country such as England during these decades of 
the nineteenth century. Availability of these data also changed his 
more conservative perception of population growth in modern times 
from that given in the first essay (1:26), where it was quite wrongly 
described as “slow” if not stationary or “retrograde” because of the
efficacy of the checks on population. 

The major change between the first and subsequent editions of the 
Essay was the addition of a further check on population, which 
Malthus called “moral restraint” (2:iii). This was a significant change 
to the argument because it reopened the door to the possibilities of 
human progress that the first essay appeared to have shut so firmly 
against the optimistic speculations in this respect of William Godwin, 
the Marquis de Condorcet, and even Adam Smith. A specific feature 
of this change of heart on Malthus's part was the increasing emphasis 
on the benefits of education for the poor in the later editions. These 
not only arose from dissemination of the advantages of moral restraint;
they also enabled a wider understanding of the laws of political


10 See Hollander (1903, pp. 50-51, nn. 6, 8, 9, 13, 19). Note 9 draws attention to a
page reference to Smith's Wealth of Nations in the Buchanan edition, which
should read p. 272 instead of p. 212 as given in the original text of Malthus (and
faithfully reprinted in vol. 7, p. 118, n. 5).

economy among the working classes. Malthus (3:526n) suggested
that ignorance of these laws was “not merely a deprivation of good 
but produces great positive evil,” a sentiment likely to be endorsed by 
most readers of this Journal. Since the economics of education and the 
education of economics are not subjects for which population Malthus
was greatly known, two brief quotations on this subject from the later 
essay on population may be permitted: 

It is particularly gratifying to me, at the end of the year 1825, 
to see that what I stated as so desirable twenty two years ago, 
seems to be now on the eve of its accomplishment. The in¬ 
creasing attention which in the interval has been paid gener¬ 
ally to the science of political economy; the lectures which 
have been given at Cambridge, London, and Liverpool; the 
chair which has lately been established at Oxford; the pro¬ 
jected university in the metropolis; and above all, the Me¬ 
chanics’ Institution, open the fairest prospect that, within a 
moderate period of time, the fundamental principles of 
political economy will, to a very useful extent, be known to 
the higher, middle, and a most important portion of the 
working classes of society in England. [3:526n; cf. 4:79-80] 

Much might be expected from a better and more general 
system of education. Everything that can be done in this way 
has indeed a very peculiar value; because education is one of 
those advantages, which not only all may share without inter¬ 
fering with each other, but the raising of one person may 
actually contribute to the raising of others. If, for instance, a 
man by education acquires that decent kind of pride and 
those juster habits of thinking, which will prevent him from 
burdening society with a family of children which he cannot 
support. [3:562-63] 

The second edition of the Essay on Population (2:ii) also drew atten¬ 
tion to Malthus’s use of sources. An important feature of Pickering’s 
Malthus is the emphasis its editors have placed on providing max¬
imum information on the printed sources Malthus used in his writ¬
ings. These are included not only with each subset of volumes; a
consolidated list is also provided (1:61-74), as well as a comparative table
of authorities used by Malthus in the six successive editions of his
Essay on Population (pp. 53-59). The editors’ preparation of this bib¬
liographical material is acknowledged to have been greatly assisted by
the work of John R. Harrison, the historical bibliographer who was
involved in the publication of the Malthus Library Catalogue covering
the personal book collection associated with Malthus now held at
Jesus College, Cambridge (Harrison 1983). 
As with other editorial aspects of this work, this in principle very
useful information suffers from some shortcomings. It is disappoint¬ 
ing, for example, that frequency of citation of individual authorities is 
not indicated in the source lists appended to specific volumes and that 
the requisite page references are not given in the lists. In addition, 
they contain bibliographical errors and are incomplete. 11 In spite of 
these deficiencies, such a list of sources has a number of uses, includ¬ 
ing assessing Malthus’s knowledge of published Physiocratic work, an 
issue that continues to attract attention (see Thweatt 1987). 

On accuracy of the text as printed and quality of the proofreading, 
it is fortunately possible to be more complimentary. However, their 
serious testing poses some difficulties that arise from the editors’
deliberate “interventionist” policy to modernize the text in various
respects. These include not only spelling, punctuation, and the use of
capitals and italics, based on practice enshrined in the thirty-ninth 
edition of Hart’s Rules for Compositors and Readers at the University Press 
(1:11, n. 7), but, more important, an attempt to introduce consistency 
in the printing of numbers in words or digits and the modernization 
of geographical and ethnographic proper names (such as Tahiti for 
Otaheite and Bedouin for Bedoween). The problem with this is that 
today’s modernity is tomorrow’s anachronism, an adage confirmed by 
the fact that Hart’s rules for Oxford University Press usage have 
enjoyed no fewer than 39 editions between 1893 and 1983. The 
editors suggest that such changes were kept to a minimum. An indica¬ 
tion of what that meant in practice was obtained by textual compari¬ 
son of the original text of one of Malthus’s essays with the version 
printed by Pickering. This disclosed no fewer than 140 changes, of 
which at least two appear more serious. 12 Most of the changes confirm 
the interventionist editorial standards explicitly adopted. 


11 This is remedied only in part by the general index, which fails to include entries
for works cited by Malthus, confining itself to including their authors as simple name
entries. Assessing frequency of citations will therefore require a considerable amount
of work. It may also be noted that apart from the general index (8:125-65), the only
volumes with an index are those that reproduce the original index of the works
reprinted (i.e., 3:625-57; 6:351-60). Wrong bibliographical information in the checklist
is easily illustrated. Items are included as anonymous for which firm author attributions
are available. Examples are the 1819 Edinburgh Review article on Robert Owen's plan
for relieving the national distress, now generally assigned to Robert Torrens, and the
1825 Westminster Review paper on the corn laws now generally attributed to John Stuart
Mill. Some entries present wrong or ambiguous information, e.g., A. R. J. Turgot’s
Réflexions in the 1788 edition, Daniel Defoe's Giving Alms No Charity, Pierre Samuel Du
Pont's Physiocratie, and Josiah Tucker’s Elements of Commerce and Theory of Taxes. There
are also surprising omissions, of which Ricardo's Proposals for an Economical and Secure
Currency and Bishop Berkeley's The Querist are the more notable.

12 The essay compared is the review on political economy (7:257-97), which
originally appeared in the Quarterly Review. On this matter I am indebted to Jack Towe for
research assistance, here gratefully acknowledged. The two serious errors he found
occur in vol. 7, p. 274, line 7, which changes the original “plat” to “plait,” and p. 284,
Aesthetically pleasing seems therefore a better way to describe the
quality of the edition than scholarly accuracy and proficiency. Its 
deficiencies in these respects considerably mar its potential usefulness 
to Malthus scholarship. However, unlike Sraffa’s Ricardo, this edition 
of Malthus's works has not had the benefit of lavish subsidy from a 
learned society and was completed in about 2 years rather than the 20 
years required by the former. 

II. Malthus on Gluts, Accumulation, 
and Effectual Demand 

In a detailed comment on a survey of recent Malthus literature, Pul¬ 
len (1987a) expressed the hope that “the future of Malthus studies 
looks promising” partly because Pickering’s collection will facilitate 
access to all of Malthus’s texts, thereby counteracting the tendency 
toward partial interpretations based exclusively on the Essay or the 
Principles. This cause of confused interpretation of Malthus is exacer¬ 
bated, Pullen argues, by the fact that these major works went through 
various editions containing considerable revisions. This section con¬ 
centrates on the first potential benefit Pullen associates with Pick¬ 
ering’s Malthus: the extent to which the presence of the bulk of 
Malthus’s work in a collected edition can assist solutions to conflicting 
interpretations of Malthus. Recent literature on Malthus on gluts, 
accumulation, and effectual demand was taken as a sample to test his 
hypothesis. 

Not only the literature sample but also the range of questions to be 
asked are limited if only for reasons of space. The sample is confined 
to seven contributions: Samuel Hollander (1969, 1979), Bleaney 
(1976), Costabile (1980, 1983), Eltis (1984), and Costabile and Row- 
thorn (1985). This small sample contains considerable scope for ex¬ 
plicit controversy (e.g., Eltis 1984, pp. 177-81; Costabile and Row- 
thorn 1985, pp. 420 nn. 1, 2; 421 nn. 1, 2; 423 n. 2). The questions 
asked address the extent to which differences in opinion expressed in 
these writings were seen capable of being resolved by textual evidence,
hence potentially assisted through the publication of Pickering’s
Malthus. They do not shed light on the controversies themselves.

What is of concern, therefore, is the practice of the authors themselves
in using a variety of Malthus’s works to settle problems of
textual exegesis and general interpretation. Works of Malthus cited
and their frequency clearly give a clue to the importance they assigned
to availability of all his works. Table 1 presents these data in a
convenient form.


line 10, which omits “any” before “new capital.” The three to four errors per page
compare unfavorably with Sraffa’s record as disclosed by Stigler (1965, pp. 305-4)
Apart from fairly predictable conclusions, such as the fact that
Hollander (1969) refers to the greatest number of different works by
Malthus (11 of the 15 mentioned by the sample as a whole), while his
1979 discussion ties for second with Eltis (1984) at nine of 15, the
data shed some interesting light on the potential usefulness of the 
Pickering collection to Malthus scholars. Of the 15 items by Malthus 
mentioned in the sample, 12 are contained in the work under review, 
and these include two of the most frequently cited references: the 
second (1836) edition of the Principles and the Definitions (1827) (177 
and 22 references, respectively), and if the first edition of the Princi¬ 
ples (1820) is included, 29 citations are added. Whether adding the 
first edition is legitimate raises questions on the value of the variorum 
quality of Pickering and some other matters discussed subsequently. 
The major omission from Pickering is the Malthus-Ricardo corre¬ 
spondence, which gets 43 citations and is rightly regarded as an indis¬ 
pensable source by researchers of this question. Together with the 
Principles, it is the only source common to all seven articles or books. 
Another item of Malthus correspondence, that with Pierre Prévost,
gets only one really independent citation in this context, 13 in Hollan¬ 
der (1969); the same applies to the evidence by Malthus to the 1827 
emigration committee, omitted from Pickering. On aggregate, works 
cited from the essays on political economy (vol. 7) account for 19 
citations from seven separate essays and the Essay on Population for a 
further 26, largely because Hollander (1969, 1979), Eltis (1984), and 
Costabile and Rowthorn (1985) include wages as part of their inquiry. 

The data in the table also shed light on the relevance of access to 
variorum editions of a work for serious Malthus scholars. This seems 
only to have been important in the case of the two editions of the 
Principles used by all researchers with varying degrees of discrimina¬ 
tion. 14 At one side of the spectrum, Bleaney regarded the matter of 
choice between editions in this context with pure indifference, argu¬ 
ing explicitly that “no important changes were made on the subject 
with which we are concerned” and that hence there is no significance
to his almost symmetrical alternative switching between the two edi¬ 
tions in his citations (Bleaney 1976, p. 60). On the other hand, where 
relevant, Hollander carefully indicates when there are variations between


13 Both Hollander (1979) and Eltis (1984) cite Hollander (1969) in the context in
which reference to the Malthus-Prévost correspondence is made.

14 The exception of Costabile (1980) is probably explained by her use of editions for
which Italian translations are available. There is an Italian translation of the second
edition, edited by P. Barucci, but there is no translation of the first (see Costabile 1980,
p. 138). In this context George Stigler has noted that, in his view, the value of variorum
editions is grossly overrated because “priority should always be given to the edition . . .
contemporaries read.” This opinion justifies neglect of the posthumous second edition
of Malthus’s Principles, the text in fact reprinted in Pickering.

the editions (e.g., 1979, p. 529, n. 168), while Eltis's use of the
two editions clearly suggests that for him they do contain significant 
changes of relevance to his research on the subject. Costabile (1983) 
(and in her paper with Rowthorn [1985]), while having a clear preference
for using the second edition, signals changes of importance from
the first edition. It may be added that use of the first edition is en¬ 
hanced in this context by the fact that the version most frequently 
used is the one partially reprinted in Ricardo (1951-73, vol. 2), which 
contains Ricardo’s notes on Malthus. Generally speaking, access to a 
variorum edition of Malthus’s Principles is clearly of value to researchers
in this sample, and Pickering's edition (vols. 5 and 6) has filled at
least part of this gap.

Obtaining a spread of Malthus’s works was also of importance to 
researchers from the investigation conducted here. This is particu¬ 
larly the case for Hollander (1969), who was praised for that very 
reason by Eltis (1984, p. 349, n. 21) and Costabile and Rowthorn 
(1985, p. 424, n. 5). In the light of Pullen’s comment (1987a) quoted 
at the start of this section, it is also interesting to note that Malthus’s 
Essay on Population is relevant to five of the seven pieces. This is not 
surprising since the last two editions cover the post-Napoleonic war 
depression in Britain (vol. 2, p. vi, which indicates changes made in 
the fifth edition [1817] partly reflecting this changed circumstance). 
By contrast, it is surprising that only Hollander (1969) cited Malthus’s 
High Price of Provisions (vol. 7, essay I), despite the extravagant praise 
Keynes (1972, pp. 88-90) gave it in this context. However, with refer¬ 
ence to the monetary aspects of the glut debate, of which Malthus 
himself was in no doubt (6:260, n. 10), the approach of Costabile, 
Eltis, Costabile and Rowthorn, and Hollander, which explicitly draw 
on Malthus’s monetary writings for that purpose, seems preferable to 
Bleaney’s more sparse approach. Likewise, the data clearly support 
the value of Malthus’s Definitions (vol. 8) in this context, and Eltis 
(1984, p. 348, n. 6) for that reason defends it against an unfortunate 
remark on its quality in James (1979, p. 410), inspired by Keynes 
(1972, p. 92, n. 1). 

From the perspective of its usefulness to Malthus scholars, this part 
of the inquiry into Pickering’s Malthus can be concluded by making 
one further comment on the value of integrating the views of the 
author on the Essay on Population with the economist of the Principles. 
The Essay on Population contains a striking passage on taxation in its 
relation to demand, which in some of its aspects makes Malthus more 
a pioneer of Reaganomics than the embryonic Keynesian he is so 
often painted to be. The passage is quoted without further comment, 
apart from a reminder that it gives further support to a corollary 
from Jacob Viner’s dictum that it has to be very peculiar economic 
doctrine if it cannot find support from a noted economic authority: 
TABLE 2

Alternative Purchase Opportunity of Malthus’s Major Works


Edition                                                            Price (U.S. $)

Essay on Population (variorum ed., edited by Patricia James),
  Cambridge Univ. Press (for Royal Econ. Soc.), 1987; £60*
Principles of Political Economy (variorum ed., edited by John
  Pullen), Cambridge Univ. Press (for Royal Econ. Soc.), 1987;
  £60*                                                                   $190.00
First Essay on Population, Kelley†                                         29.50
Definitions of Political Economy, Kelley                                   29.50
The Measure of Value, Kelley†                                              29.50
The Pamphlets of T. R. Malthus, Kelley                                     35.00‡
Five Papers on Political Economy, reprint of Economic Classics,
  ser. 1, no. 3, Dept. Econ., Univ. Sydney; $A 4.00                         2.75
Review of Bullion Controversy 1811, reprint of Economic Classics,
  ser. 1, no. 2, Dept. Econ., Univ. Sydney; $A 2.00                         1.40

Total                                                                    $320.00*


Note: This reproduces all the Malthus texts in Pickering's collection except for essays 2, 5, 6, and 7 in vol. 4.

* Publisher's estimated price only.

† Not included in their most recent catalog.

‡ At bargain price of $12.50 for this item, the total cost falls to $295.00.



The effects of taxation are no doubt in many cases perni¬ 
cious in a very high degree; but it may be laid down as a rule 
which has few exceptions, that the relief obtained by taking 
off a tax, is in no respect equal to the injury inflicted in laying 
it on; and generally it may be said that the specific evil of 
taxation consists in the check which it gives to production, 
rather than the diminution which it occasions in demand. 
[3:378] 

III. Conclusions 

The quality and usefulness of the edition to prospective buyers hav¬ 
ing been discussed, the matter of price must be briefly considered to 
complete the advice on this consumer choice problem in Malthus 
editions. The eight volumes of Pickering’s Malthus, which have to be 
treated as a package since they are not sold by single volumes, cost 
£360 ($570) at the approximate rate of exchange in mid-March 1987. 
For once a near-perfect if not more perfect substitution is available
from a combination of other editions. The data on this are given in
table 2. 

Table 2 shows that nearly every Malthus text included in Pickering,15
but not, of course, the introductions, general index, and list of


15 Texts included in Pickering and not currently available in reprint are the two
reviews dealing with Ireland (Edinburgh Review, 1808, 1809), the review of Godwin on

printed sources, can be bought for close to half the price that Pickering
charges for its edition. From discussions in earlier parts of this
article, it has been demonstrated that mainly aesthetics of the book¬ 
shelf are sacrificed if the alternative package delineated in table 2 is 
purchased and that in buying the first two items produced by the 
Royal Economic Society and Cambridge University Press the buyer is 
likely to gain in scholarly quality with respect to the variorum work, 
editorial notes, and introductions. It is a pity that Pickering’s inter¬ 
esting new venture has traded off speedy completion for editorial 
quality, though this choice also reflects an economic climate changed 
from that experienced by Sraffa. However, economists interested in 
Malthus can only profit from these benefits of competition in the 
expensive market of reprints of economic classics, which this has pro¬ 
duced. 
Old South, New South: Revolutions in the Southern Economy since the Civil War. By

Gavin Wright. 

New York: Basic Books, 1986. Pp. x + 321. $19.95. 

In Old South, New South, Gavin Wright sets out to explain how what had been 
the “most backward and impoverished section of the United States” until 
1940 came to be a region that since that date “has persistently outpaced the 
rest of the nation in the growth of incomes, industry, jobs, commerce, con¬ 
struction, and education” (p. 3). His search for causes begins in the antebel¬ 
lum period that was the primary focus of his earlier Political Economy of the 
Cotton South, and this sequel carries the story to the present. 

The central focus of Old South, New South is on the operation of labor 
markets, for Wright believes that the isolation of the southern labor market 
was the key to the region's economic history up to the New Deal, while the 
integration of the South into the national labor market was the catalyst for its 
recent growth. Yet although the book’s organizing scheme centers on the 
labor market, Wright’s view ranges widely over the entire southern economy, 
and the book provides a broad economic history of the region over more than 
a century. Although the topics treated are familiar ones and have long been 
discussed by economists and historians, in his version many appear in a new 
light, and new relations and connections appear among old actors. Indeed
few existing interpretations of southern economic history are not at least 
modified or given a new twist by Wright's interpretations. Although a short 
summary cannot convey the intricacy of many of his arguments, the following 
account can serve to suggest some of the main features of his story. 

Wright begins with a novel interpretation of the effects of slavery on the 
antebellum southern economy. Sidestepping recent discussions of the effi¬ 
ciency of slavery, he argues that slavery caused a fundamental divergence 
between the northern and southern economies by weakening the link be¬ 
tween wealth accumulation and land. In the North, the energy of entrepre¬ 
neurs focused on ways to increase land values; these included the construc¬ 
tion of roads and railroads, towns and villages, and schools and factories, the 
search for precious metals and mineral deposits, and the attraction of new 
settlers. In the South, however, slave owners could repeatedly migrate, taking 
their slaves with them to grow cotton in areas of higher fertility, thereby 
increasing their own wealth without engaging in the local development activi¬ 
ties that made the North the socially progressive region of the early nine¬ 
teenth century. 

After the Civil War this difference in incentives disappeared, but Wright 
argues that the southern labor market remained largely isolated from the 
North. The region’s high fertility rates combined with a small industrial sector
to make the South a "low-wage region in a high-wage country" (p. 76).
Cotton continued to dominate southern output, and the region remained 
rural and backward. Competition within southern labor markets did serve to 
eliminate black-white wage differentials for unskilled labor in the late 
nineteenth century, but discrimination prevented blacks from obtaining the 
education or on-the-job training that would have allowed them to enter 
higher-paying occupations. Wright argues strongly that competition in labor 
markets produced no tendency to break down racial criteria for occupational 
assignments and promotions. 

The rapid growth of the cotton textile industry in the South in the late 
nineteenth and early twentieth centuries did provide industrial employment 
to increasing numbers of white southerners since the early success of the 
southern industry relative to that of New England lay precisely in its ability to 
tap the large pool of low-wage southern farm labor. The growth of the industry
slowed in the 1920s, however, in part because of international competition
and in part, Wright argues, because of an unwillingness of southern manufac¬ 
turers to cut the wages of their increasingly experienced workers. More gen¬ 
erally, the southern industries that grew in the early twentieth century were 
labor-intensive industries that could take advantage of the region's cheap 
unskilled labor. The South therefore missed out on the growth of those 
capital-intensive industries that constituted the most dynamic part of north¬ 
ern industrialization. 

Large-scale migration of southern blacks to the North began during World 
War I, as northern employers faced by labor shortages due to the decline in 
European immigration actively began to recruit southern workers for their 
factories. Wright argues, however, that even the migration of as many as a 
million southerners to the North during 1915-20 did not lead to the creation 
of a national labor market, with increased southern wages; he contends that 
the predominance of urban and industrial workers among the migrants left 
the region’s low-wage agricultural sector unaffected. The bulk of unskilled 
southern workers remained caught in a regional labor market. 

Fundamental change came only with government intervention. A series of 
New Deal actions, beginning in 1933 with the National Industrial Recovery 
Act, began to impose national minimum wage standards on southern indus¬ 
tries. Much of the immediate burden of rising wages was borne by black 
workers who were displaced by whites. Yet both this displacement of black 
industrial workers and the increasing use of tractors in southern agriculture 
for preharvest operations during the 1930s produced a declining southern 
demand for unskilled labor that led to a renewed migration, as many south¬ 
ern blacks were pushed to the North in the forties. In Wright’s view, this was 
the turning point that marked the beginning of the end of the political econ¬ 
omy of the Old South. He further argues that the military labor demands of 
World War II and wartime northern labor shortages created a pull for south¬ 
ern workers that advanced the integration of northern and southern labor 
markets. The rising wages that then resulted from the new scarcity of harvest 
labor in the South led to a concentration of efforts to develop a mechanical 
harvester for cotton. When this breakthrough came, the 1950s saw a further 
sharp drop in the southern demand for agricultural labor, and more rural 
southerners left for the North. Extended coverage of federal minimum wage 
legislation further reinforced the tendency toward integration of the national 
labor market during the fifties and sixties. 

The implications for the South were profound. As the South ceased to be a 
low-wage region, the nature of its growth changed. Urbanization and the
recruitment of technologically advanced northern industries became the new 
economic imperatives. The eventual success of the civil rights movement in 
the South was achieved as the result of the desire of southern communities to 
appear attractive to northern entrepreneurs in order to gain local investment 
and other economic benefits of a good reputation. 

As may be apparent from the preceding summary, Old South, New South 
ranges over a very large number of issues in southern economic history, and it 
is hardly surprising that the quality of the evidence available on them varies 
considerably. Some have been studied intensively, and Wright synthesizes the 
results of scores of earlier studies in addition to providing much new re¬ 
search. Other questions have received less attention, and evidence on them is 
often elusive. In consequence, parts of Wright's account of southern eco¬ 
nomic history are accompanied by illustrative evidence, but they remain to be 
tested empirically in systematic and detailed fashion. The book should there¬ 
fore serve as a stimulus to research, for it presents many new arguments and 
challenges existing interpretations of a number of important issues. Some 
examples might be noted here of significant questions that merit further 
attention. 

For the antebellum period, an important issue involves Wright’s characteri¬ 
zation of the role of slave owners in southern economic development. At a 
logical level, even if they did migrate frequently, this would not necessarily 
imply a lack of interest on their part in raising land values, for these could be 
captured as capital gains when they moved. Empirically, recent scholarship by 
social and political historians has in fact increasingly stressed the progressive 
role of the large planters in bringing culture to the South in the form of towns 
and cities, schools, and churches (for one recent discussion, see Oakes [1984]). 
Systematic comparisons of the development of northern and southern com¬ 
munities would appear necessary to test Wright’s argument that the economic 
logic of slavery did little to encourage the growth of infrastructure in the 
South. 

Wright’s emphasis on the importance of the isolation of the southern labor 
market in the late nineteenth century should also serve to return this issue to 
the research agenda of historical economists. His argument for the existence 
of a disequilibrium between southern and northern labor markets deserves 
detailed investigation, perhaps in the form of calculating the potential returns 
to migration for individuals, taking into account both the costs of moving and 
the actual wages new workers from the South could have received in northern 
agriculture and industry. Such a study might build on the earlier research of 
Vickery (1969) and ask whether nineteenth-century black southern laborers 
could have made substantial gains by migrating to the North. Another labor
market issue from the same period that merits further attention is Wright’s 
argument that market forces did not produce a tendency for southern blacks 
to acquire occupational skills. Since it would appear to have been in the 
interest of employers to reduce skilled wages by allowing blacks access to 
training, the question of why they did not would seem a significant puzzle. 
Possible explanations suggested by earlier studies that might be explored 
further include a widespread taste for discrimination among white employers 
and the use of collective action by both employers and skilled white workers to
prevent the training of blacks. 

Perhaps the most intriguing section of Old South, New South is Wright’s 
account of the rise of the modern southern economy. At its conclusion, 
Wright draws a strong lesson from the southern experience, arguing that the 
breaking down of racial economic barriers will not result from market forces
alone but also requires political action: “If the southern experience has any 
lesson for South Africa today, it is that the natural forces of economic prog¬ 
ress do not break racial barriers unless people speak up through every possi¬ 
ble channel” (p. 269). It is important to ask whether this conclusion follows 
from Wright’s analysis. In part, a resolution must attempt to identify the 
specific effects of particular government actions and market forces and in 
each case to try to assess the costs and benefits. Wright makes progress toward 
this goal, but his accounts of particular episodes raise questions about the 
generality of the conclusion quoted here. For example, although he gives 
federal industrial minimum wage legislation credit for beginning the integra¬ 
tion of the South into national labor markets in the 1930s, he observes that 
this legislation caused severe hardship, in the form of unemployment, for 
southern blacks (for an early recognition that the burden of minimum wage 
laws might fall particularly heavily on blacks, see Johnson [1951, p. 87]). 
Furthermore, in Wright’s account it was only as a result of market forces—the 
pull created by northern labor shortages during World War II and the push 
resulting from the development of a mechanical cotton picker—that the rural 
South finally became genuinely integrated into a national labor market (for 
further discussion, see Alston [1987]). Later, in his analysis of the 1960s, 
Wright stresses that it was the desire of southerners to attract northern investment
by presenting their towns and cities as “safe, civilized communities” that
caused them to eliminate racial barriers (p. 266). Here he appears to be 
suggesting that it was the invisible hand of self-interest, rather than the inter¬ 
vention of the federal government, that led to the growth of economic oppor¬ 
tunity for blacks. Yet the two forces need not have operated in isolation. 
Recent research by Heckman (1986) points to the possibility that the federal 
government’s insistence on racial equality of economic opportunity in the 
1960s made it possible for southern employers to follow a course of action 
they considered economically desirable, but which social pressures had earlier 
foreclosed. 

Old South, New South is an important work, impressive for both its scope 
and the force of its arguments. In considering the development of the south¬ 
ern economy from slavery to the present, Gavin Wright has offered thought¬ 
ful and penetrating answers to many questions in recent American economic 
history. The importance of these issues virtually guarantees that they will 
continue to be intensively studied and hotly debated, and the clarity and 
directness of Wright’s prose will increase the attractiveness of this book to a 
wide audience of economists and other social scientists. Old South, New South 
significantly advances our understanding of the development of the modern 
southern economy, and it will surely become the standard against which 
future treatments will be measured. 


University of Chicago 


David W. Galenson 
One of the important roles of the university in society is to provide an
environment for smart and creative people to do research. In this 
context the institution of tenure, or almost complete job security for 
older members of a department, is a puzzle. This security surely 
reduces the effort of some professors, but, more important, it pre¬ 
vents the attainment of an optimal assignment of workers to jobs. 
Given the finite resources available to a university, the opportunity 
cost of giving tenure to an incumbent is the lost output of the younger 
people who will not be hired in the future because the funding (i.e., 
the “slot”) is not available. Why are less productive older professors
not replaced with promising young candidates? 

An analogy with professional team sports is instructive. Professional
athletics is similar to academics in some important ways. Athletes
undergo long and rigorous training that is almost completely
specific to the industry. Although the rewards for success are high, 
the decision to invest in this training is risky since the probability of 
success is low. The most productive part of an athlete’s career is short 
relative to a normal working lifetime and occurs while he is young. 
The opportunity cost of filling a position on the team with an older 
athlete is again the lost output of a potential replacement since the 
number of positions on the roster (as well as on the playing field) is
fixed. 

Despite these similarities, hiring and firing practices in professional 
sports are in stark contrast to those in academics. Older athletes, even 
former superstars, are regularly let go as soon as they become too 
expensive or their abilities fall below those of potential replacements. 
The inherent riskiness of the profession is mitigated not by guaran¬ 
teed employment but by guaranteed payments, retirement plans, and 
disability insurance. 

Since one would not like to argue that academic training is more 
specific or more intensive or inherently riskier than that in athletics or 
that people with talented minds are more risk averse than people with 
talented bodies, it is difficult to suggest that these are the reasons for 
the job security enjoyed by older professors. 1 This paper provides a 
different explanation. The basic idea is that the difference between a 
baseball team and an academic department is the way in which new 
members of the team are selected. In baseball, the team owners 
through their agents, the managers, choose who is to play. In academ¬ 
ics this task is performed by the incumbent members of the depart¬ 
ment. In an explicit model this paper derives conditions on the re¬ 
ward functions of incumbents that must be satisfied if they are to be 
willing to hire the best candidates for jobs. Academic tenure is consis¬ 
tent with these conditions while the “baseball” solution is not. Loosely, 
tenure is necessary because without it incumbents would never be 
willing to hire people who might turn out to be better than them¬ 
selves. 

The analysis is consistent with several other aspects of the academic 
environment. It provides a rationale for “tenure-track” appointments 
and says something about the standards that can be used for tenure 
decisions. The job security derived here is not absolute. Incumbents 

1 There have been several other attempts in the literature to account for academic 
tenure. The argument that tenure allows researchers to take risks is examined by Kahn 
and Ito (1986). Other papers include Harris and Weiss (1984) and Freeman (1977).
This paper differs from Harris and Weiss’s in that tenure here is explicitly a
second-best institution: older workers are kept on even though by the firm’s best estimate they
may be unproductive. Freeman’s story is based on risk aversion and is therefore unable 
to explain why tenure seems to be a characteristic of academic employment but not 
other jobs in which workers are risk averse. 
can be released if they fail to meet exogenous standards of performance
(i.e., engage in “gross moral turpitude”) or if the separations
are voluntary (contract “buy-outs” or early retirement). In times of
financial crisis, when involuntary separations are inevitable, the 
model suggests that entire departments be eliminated. This is because 
the members of one department do not choose the new hires of 
another. The framework used for the analysis is quite general, so it 
also makes predictions about the form of other organizations in which 
members have input into overall decisions. 

Section I sets up a simple model of a university with one depart¬ 
ment and examines its optimal hiring and firing policies when it col¬ 
lects its own information about the abilities of incumbents and candi¬ 
dates. In this case the solution is consistent with the practices in 
professional athletics. Section II introduces the complication that in¬ 
cumbent members of the department have better information about 
the potential of candidates than the administration. A second impor¬ 
tant assumption is that the utility an incumbent could enjoy in his best 
alternative job is also private knowledge. The implications of the con¬ 
straints that they be willing to reveal this information truthfully are 
then worked out. This is followed by a more informal discussion of 
the practices in academics and in other organizations in which incum¬ 
bents have some input into decisions. Section III presents conclu¬ 
sions. 

I 

One of the most important roles of the university in society is the 
encouragement of research that would not otherwise take place in the 
private sector. This includes the production of scientific and technical 
knowledge that cannot be appropriated as well as knowledge that 
would never be of value to firms in the private sector, such as that 
produced by research into literature, philosophy, pure mathematics, 
and public policy. The income of a university, while it may ultimately 
be influenced by the quality of its research output, seems to depend 
much more on the vagaries of government policy, the perceived ef¬ 
fect of its degrees on the future income of students, and the success of 
its athletes. Accordingly, this paper will model the university as a 
nonprofit institution with the goal of maximizing research output 
subject to a budget that is determined by other factors. 2 

To focus on the main issues we will abstract from several other 


2 It is beyond the scope of this paper to model explicitly the objectives and constraints
of university administrators. The present approach serves to focus attention on the 
question of whether tenure can be consistent with the maximization of research output. 
aspects of the university environment. In particular, professors are
risk neutral and are hired only to do research. Individual research 
output can be described by a single index that measures quality and 
quantity with the correct weights. In keeping with the professional 
nature of the job, it is assumed that hours of work and effort are not 
observable. We will also assume at this stage that there is only one 
university in the economy. 

The university wishes to hire good researchers away from their 
alternative occupations to allow them to do research. One of the con¬ 
straints it faces is that it takes time for any research project to be 
completed and, in particular, for its value to be ascertained. We will 
formalize this in a discrete-time framework by assuming that the deci¬ 
sion to hire or retain people is made at the beginning of the period 
and their output is realized at the end of the period. 

The basic framework involves overlapping generations of researchers.
Faculty members have a working lifetime of three periods. Between
periods the oldest workers retire, some young workers may be
hired, and some of the younger incumbents may quit or be fired. The 
university’s problem, in general terms, is to decide at the beginning of 
each period which workers to hire from its pool of applicants and how 
many, as well as which and how many of its incumbents should be let 
go. The university also has the option of saving its money to hire 
candidates in the future or of borrowing against these future posi¬ 
tions. In this section the university will have full information about 
the productivities of all incumbents and all potential candidates. 

The university expects to exist for infinitely many periods and has
access to a perfect capital market. At the beginning of each period $t$,
$H_t$ potential researchers apply for a job. Denote by $a_t^h$,
$h \in H_t = \{1, 2, 3, \ldots, H_t\}$, the ability of candidate $h$. This is
defined as the output the worker will produce if he is hired for period $t$,
and it is drawn from the distribution $\psi(a)$. It is possible, therefore, to
have “good years” and “bad years” for candidates. Denote by $a_t^i$ the ability
of incumbent worker $i \in I_t = \{1, 2, 3, \ldots, I_t\}$, which is the output
$i$ will produce in period $t$ if he is retained. Ability is not constant over
the worker’s lifetime but is governed by the process
$a_{t+1} = a_t + \phi_{t+1}$, where $\phi$ is distributed according to
$g(\phi)$, $E(\phi) = 0$. Thus when the university is not able to observe the
true potential output of incumbents, the previous period’s output will give an
unbiased estimate. In general the superscript $h$ will refer to professors who
have not yet been hired or who have been rejected. The superscript $i$ will be
used for all incumbents, including those who have just been hired.
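
The claim that last period's output is an unbiased forecast follows directly from the mean-zero shock. A minimal sketch, assuming a discrete shock distribution chosen purely for illustration (the function name and the numbers are not from the paper):

```python
from fractions import Fraction


def expected_next_ability(current, shocks):
    """E[a_{t+1} | a_t] for a_{t+1} = a_t + phi, with phi drawn from `shocks`.

    `shocks` maps each possible shock value to its probability (assumed discrete).
    """
    assert sum(shocks.values()) == 1, "shock probabilities must sum to one"
    return sum(p * (current + phi) for phi, p in shocks.items())


# Hypothetical mean-zero shock: a bad year, a flat year, or a good year.
shocks = {-2: Fraction(1, 4), 0: Fraction(1, 2), 2: Fraction(1, 4)}
forecast = expected_next_ability(Fraction(7), shocks)
print(forecast)  # 7
```

Exact fractions are used so that the forecast equals the current ability without rounding error.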

In a world in which the abilities of incumbents are constant and the 
same set of abilities of candidates is presented to the university each 
period, it is not difficult to show that tenure for incumbents will be an
outcome. The firm’s optimal strategy is to set a standard and hire or 
retain only those professors who can meet it. But if a worker can meet 
this standard in one period, he will meet it in every period since his 
ability is constant and the standard does not change from year to year. 
The major role of the structure introduced in the previous paragraph 
is to rule this case out in a relatively neutral way. 

If a candidate is not hired, he goes to an alternative that is worth $\bar{V}$.
In subsequent periods his research abilities decay, and he will never
again apply for academic work. This simplifies the problem by ensuring
that new hires are made only at the junior level. In general the
present value of an incumbent’s alternative will fluctuate by random
amounts from period to period to reflect changes (perhaps anticipated)
in the value of his time at home or in other jobs. The analysis is
further simplified by the assumption that the alternatives of incumbents
will be strictly less than $\bar{V}$. Denote the present value of an
incumbent’s alternative in period $s$ by
$V_s^i \in (\underline{V}, \bar{V}) \subset R_+$; $V_s^i$ is independent of
ability, and for professors in their first period, $V_s^i = \bar{V}$.

The university receives income in period $s$ denoted by $b_s$. The present
value of its budget in period $t$ is therefore given by

$$B_t = S_t + \sum_{s=t}^{\infty} \delta^{s-t} b_s, \tag{1}$$

where $S_t$ is the level of assets carried into period $t$ and $\delta$ is the
discount factor. The university discounts future research output at the same
rate. Thus if a professor has been hired or retained at the beginning
of period $t$, the expected present value of his research output is given
by

$$A_t^i = a_t^i + \delta R_{t+1}^i A_{t+1}^i, \tag{2}$$

where $R_s^i$ is the probability that incumbent $i$ is retained into period $s$.
Under full information $R_s^i \in \{0, 1\}$, and if $i$ is in his third period in
period $s$, then $R_{s+1}^i = A_{s+1}^i = 0$; $A_t^h$ is defined similarly.

The value of working for the university for someone who has been
hired or retained at the beginning of period $t$ is given by

$$W_t^i = w_t^i + \delta \left[ R_{t+1}^i W_{t+1}^i + (1 - R_{t+1}^i)(V_{t+1}^i + P_{t+1}^i) \right], \tag{3}$$

where $w_t^i$ is the wage paid at the end of period $t$ and $P_{t+1}^i$ is the present
value of any pension the worker will receive if he is not retained into
period $t + 1$. As before, if the professor is entering his third period of
work in period $s$, then $R_{s+1}^i = P_{s+1}^i = W_{s+1}^i = 0$. Denote by $p_s^f$
the installment on former worker $f$’s pension in period $s$. The set of
former workers in period $s$ is denoted by $F_s$.
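
The recursions in (2) and (3) can be unrolled backward over a three-period career. The following is only a sketch under stated assumptions: the function names, the list-based layout, and all numbers are illustrative, and the terminal alternative and pension are set to zero for simplicity.

```python
def output_value(abilities, retain, delta):
    """Eq. (2) unrolled backward: A_t = a_t + delta * R_{t+1} * A_{t+1}.

    abilities[k] is output in career period k; retain[k] is the probability
    of being retained into the following period (0 after the last period).
    """
    value = 0.0  # value after the final working period
    for a, r_next in zip(reversed(abilities), reversed(retain)):
        value = a + delta * r_next * value
    return value


def job_value(wages, retain, alternatives, pensions, delta):
    """Eq. (3) unrolled backward:
    W_t = w_t + delta * [R_{t+1} W_{t+1} + (1 - R_{t+1}) (V_{t+1} + P_{t+1})].

    alternatives[k] and pensions[k] refer to the period after wages[k].
    """
    value = 0.0
    for w, r, v, p in zip(reversed(wages), reversed(retain),
                          reversed(alternatives), reversed(pensions)):
        value = w + delta * (r * value + (1.0 - r) * (v + p))
    return value


# Hypothetical professor who is retained for all three periods.
A = output_value([4.0, 6.0, 8.0], [1.0, 1.0, 0.0], delta=0.5)
W = job_value([1.0, 2.0, 3.0], [1.0, 1.0, 0.0], [2.0, 2.0, 0.0],
              [0.0, 0.0, 0.0], delta=0.5)
```

With these numbers the third-period terms contribute only the current output and wage, which mirrors the terminal conditions stated in the text.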
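The university’s problem at the beginning of period $t$ under full
information can now be written as

$$\text{maximize} \ \sum_{s=t}^{\infty} \delta^{s-t} \sum_{i \in I_s} a_s^i \tag{4}$$

subject to

$$W_s^i \ge V_s^i \quad \text{for all } i \in I_s, \tag{5}$$

$$B_t \ge \sum_{s=t}^{\infty} \delta^{s-t} \Bigl( \sum_{i \in I_s} w_s^i + \sum_{f \in F_s} p_s^f \Bigr), \tag{6}$$

$$I_s \subset H_{s-1} \cup I_{s-1}, \tag{7}$$

and

$$I_s \cap H_{s-r} = \emptyset, \quad r \ge 4. \tag{8}$$
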


This is a discrete programming problem, so one might suspect that 
the shadow price on the budget would not be well behaved. For ex¬ 
ample, if the university exists for only one period, professors cost 10 
dollars, and there is one dollar left over after all the best ones are 
hired, then the marginal value of another dollar is zero even though 
another nine dollars might be very valuable. In the current frame¬ 
work the infinite horizon of the firm provides an alternative use for 
this dollar. It can be saved until it has grown to 10 dollars (or more) 
and then used to hire another professor. Since this future candidate’s 
output is discounted at the same rate as the budget, the effect is just as 
if the dollar could be spent on a tenth of one professor now. 3 There is
still the problem, however, that hiring or firing a worker will dis¬ 
cretely change the shadow price of the budget by an amount that 
depends on the cost of hiring him. 
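
The saving argument can be checked with a little arithmetic. The sketch below assumes that savings earn the gross return $1/\delta$ per period and treats the waiting time as continuous; the function name and the numbers are illustrative only.

```python
import math


def periods_to_afford(leftover, cost, delta):
    """Periods a leftover balance must earn the gross return 1/delta to reach `cost`."""
    return math.log(cost / leftover) / math.log(1.0 / delta)


delta, cost, leftover, output = 0.9, 10.0, 1.0, 50.0

# Time until the single dollar has grown into the price of one professor.
n = periods_to_afford(leftover, cost, delta)

# The professor hired then produces `output`, discounted back to today.
discounted_output = delta ** n * output
```

Because the budget and the output are discounted at the same rate, the discounted output equals `output * leftover / cost`, which is the value of a tenth of a professor hired today.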

We will assume that the university’s choices are constrained by its
budget. (If not, it is easy to see that it will hire or retain everybody for
whom $A_s^i > 0$.) Since tenure in practice is an explicit legal contract, we
will make the assumption that the university can offer workers ex¬ 
plicit contracts. These will specify wage and firing rules as functions 
of observable variables such as the budget level, the abilities of incum¬ 
bents and candidates, and the workers’ own ability, alternative, and 
seniority. This means that the individual rationality constraint (5) may 
not be binding on workers in their second and third periods. However,
it is easy to show that the constraint will be binding for new hires
and from there to characterize the solution to the program.

3 Even if the university had a finite horizon, the same result could be obtained by
giving it a contemporaneous alternative use for its funds.

Proposition 1. The solution to the program outlined in (4)-(8) is
characterized, for all $s, r \ge t$, by (5) and (a) $W_s^i = \bar{V}$ for all
$i \in I_s \cap H_{s-1}$, (b) $A_s^i \ge \lambda_-^i V_s^i$ for all $i \in I_s$, and
(c) $A_r^h \le \lambda_+^h \bar{V}$ for all $h \in H_{r-1} \setminus I_r$, where
$\lambda_-^i$ is the shadow price that would be attached to the budget constraint
if incumbent $i$ were fired (or had not been hired), ceteris paribus, and
$\lambda_+^h$ is the shadow price that would be attached to the budget constraint
if the candidate $h$ were hired.

Proof. It is useful to rewrite the program as the following Lagrangian
to be maximized subject to (5), (7), and (8):

$$\sum_{s=t}^{\infty} \delta^{s-t} \sum_{i \in I_s} a_s^i + \lambda \Bigl[ B_t - \sum_{s=t}^{\infty} \delta^{s-t} \Bigl( \sum_{i \in I_s} w_s^i + \sum_{f \in F_s} p_s^f \Bigr) \Bigr]. \tag{9}$$

For part a of the proposition, by previous arguments, $\lambda$ is positive at
an optimal solution. Assume that $W_s^i > \bar{V}$ for some $i$, $s$,
$i \in I_s \cap H_{s-1}$. Then a marginal reduction in $W_s^i$ will have no effect
on that professor’s output since he will still accept employment, but (9) will
be increased by the reduction in the wage bill. For parts b and c, given a,
the marginal effect on the wage bill of hiring junior worker $i$ in period
$s$ is $V_s^i = \bar{V}$. The cost of retaining incumbent $i$ into his second period
has two components: the direct cost given by $W_s^i - P_s^i$ and a benefit
due to the fact that the university was able to offer him less income in
the previous period. This benefit (using a again and eq. [3]) works out
in period $s$ dollars to $P_s^i + V_s^i - W_s^i$. Thus the overall effect on the
wage bill is $V_s^i$ as before. A similar argument establishes the same
result for workers entering period 3.

Hiring or retaining a worker reduces the resources available for
hiring other workers and therefore increases the marginal value of
the budget. The overall effect on (9) of hiring or retaining a worker $j$
will therefore lie between $A_s^j - \lambda_-^j V_s^j$ and $A_s^j - \lambda_+^j V_s^j$.
Thus if at an optimal solution in any period $s$ incumbent $i$ is hired or
retained, we know that $A_s^i \ge \lambda_-^i V_s^i$, and if in any period $r$
candidate $h$ is not hired, we know that $A_r^h \le \lambda_+^h \bar{V}$. Q.E.D.
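
The wage-bill step in the proof can be verified numerically. The sketch below assumes constraint (5) binds for a new hire, so that his first-period wage is pinned down by eq. (3); the function names and the numbers are illustrative and not part of the original argument.

```python
def first_period_wage(v_bar, delta, retained, w_next, v_next, p_next):
    """First-period wage that makes (5) bind for a new hire, from eq. (3) with W = v_bar."""
    continuation = w_next if retained else v_next + p_next
    return v_bar - delta * continuation


def net_cost_of_retaining(v_bar, delta, w_next, v_next, p_next):
    """Net wage-bill effect, in second-period dollars, of retaining rather than firing."""
    # Direct cost: the retained worker receives W instead of the pension P.
    direct = w_next - p_next
    # Change in the first-period wage, converted to second-period dollars.
    # It is negative whenever W exceeds V + P, which is the benefit in the proof.
    wage_change = (
        first_period_wage(v_bar, delta, True, w_next, v_next, p_next)
        - first_period_wage(v_bar, delta, False, w_next, v_next, p_next)
    ) / delta
    return direct + wage_change


cost = net_cost_of_retaining(v_bar=10.0, delta=0.9, w_next=12.0, v_next=7.0, p_next=2.0)
```

Whatever values are chosen for the promised reward and the pension, the net cost collapses to the worker's alternative, here 7.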

Clearly the actual value of the multipliers will depend on every 
aspect of the problem from the size of the budget to the distribution 
of available candidates in all periods. The character of the firm’s 
optimal hiring and firing policy is easy to determine, however. If the 
cost of hiring a worker becomes very small relative to the size of the 
budget, a continuous approximation to the problem can be used that 
will imply that

$$\frac{A_s^i}{V_s^i} \ge \frac{A_r^h}{\bar{V}} \quad \text{for all } i \in I_s,\ h \in H_{r-1} \setminus I_r. \tag{10}$$
This states simply that at any full-information optimum, every incumbent
in any period has a higher output-to-alternative ratio than any
unsuccessful candidate in that or any other period. In the current
discrete context, this conclusion has to be modified slightly for some
workers who are close to the margin. If two workers have similar
output-alternative ratios, the university may prefer the one with the
lower alternative since the average cost per unit of output is lower.
Note the difference from the usual condition that workers should be
sent to their alternatives whenever $P_s A_s^i < V_s^i$, where $P_s$ is the price of
output. This is the major implication of the assumption that the output
of professors is not sold on the market.

Tenure as we know it is certainly not a part of the full-information 
optimum. Professors whose ability alternative ratios fall below those 
of potential replacements will be fired even if it is to save the money 
for replacements who will not appear for several periods. A particu¬ 
larly good year for candidates may lead to wholesale replacement of 
the weaker incumbents. Also, a ceteris paribus decrease in the budget 
will increase its shadow price and lead to higher standards for hiring 
and retention. Again the weakest and most expensive professors will 
be fired. 
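
A one-period snapshot of this policy can be sketched as a ranking by output-alternative ratio with a budget cutoff. This is a simplification of the dynamic program, assuming each worker costs the university his alternative and ignoring saving across periods; the names and numbers are hypothetical.

```python
def select_by_ratio(people, budget):
    """Keep workers in descending order of A / V until the budget cannot cover the next one.

    `people` is a list of (name, A, V) tuples, where A is expected output value
    and V is the alternative, taken here as the cost of employing the worker.
    """
    kept = []
    ranked = sorted(people, key=lambda person: person[1] / person[2], reverse=True)
    for name, _value, alternative in ranked:
        if alternative > budget:
            break  # everyone below this ratio is released or not hired
        kept.append(name)
        budget -= alternative
    return kept


people = [
    ("former star", 6.0, 4.0),       # ratio 1.5
    ("steady incumbent", 9.0, 3.0),  # ratio 3.0
    ("new candidate", 10.0, 2.0),    # ratio 5.0
    ("weak candidate", 2.0, 2.0),    # ratio 1.0
]
kept = select_by_ratio(people, budget=5.0)
```

In this example the former star is released in favor of the new candidate, regardless of past accomplishments, which is the "baseball" outcome described in the text.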

As mentioned in the Introduction, perhaps the best real-world ex¬ 
ample of employment practices in which the firm observes the abilities 
of incumbents and candidates but also faces a fixed budget is that of a 
professional sports team. There the constraint is the maximum num¬ 
ber of players that may legally be kept on the roster (or on the playing 
field), but the parallel is clear. Players whose expected future output is 
low, regardless of their past accomplishments, regularly lose their 
positions so that better ones can take their place. Some of these play¬ 
ers have very poor alternatives, although they are still competent 
athletes. Of course, output is more readily observable in that context 
than in academics, but it is not essential to the present result that 
ability be observed exactly. If all the relevant variables are interpreted 
as the university's best estimates, then an expected output-max¬ 
imizing policy will satisfy the same conditions. Things will change, 
however, if the quality of the firm’s information depends on its hiring 
and firing policies. This complication is examined in the next section. 

II 

Consider now the problem examined above except that the university 
does not have full information about the abilities of incumbents or 
their alternatives and incumbents have better information than the 
university about the abilities of candidates. The formal setup is the 
same as in the previous section with the following changes. 
At the beginning of period $t$ the firm has observed the output of its
incumbents in period $t - 1$ and knows this to be an unbiased signal of
their abilities in period $t$. It does not have the ability to judge the
talents of potential new hires. The incumbents at the firm do have this
ability, but it is imperfect. Denote by $\iota^{ih}$ the information known to
incumbent $i$ about the ability of candidate $h$. It is assumed to be an
$n \times 1$ vector of real numbers. This information is informative about
the candidate’s true ability. The firm knows this and within each
period solicits its incumbent workers for their information. Each
worker replies with a signal $\hat{\iota}^i$, which is an $nH_{t-1} \times 1$ vector of
information, with $n$ elements for each candidate. The firm amalgamates
this information optimally in the function
$G(\hat{\iota}^1, \hat{\iota}^2, \ldots, \hat{\iota}^{I_{t-1}})$ to form
an estimate of each candidate’s ability.

In general, incumbents will have opinions about the expected fu¬ 
ture output of candidates as well as direct information such as their 
responses to questions in interviews. These opinions will be influenced
independently by their priors over the true information of
other incumbents. Since the administration can do no better than 
when it has the true information and since the diversity of informa¬ 
tion about a candidate may also be informative, it will be important 
that each incumbent report his own information rather than his opin¬ 
ion. This also avoids the social choice problems that might arise if 
incumbents genuinely wanted to hire the best candidates but did not 
trust each other’s opinions. We assume here that the university con¬ 
sults with all its department members, and later we will consider some 
of the consequences of relaxing this. 

Another important assumption is that the university does not know 
the alternatives of its incumbents. This is critical since otherwise the 
university could compensate workers who get fired with pensions that 
make them indifferent to their employment status. Then as long as 
the overall returns of incumbents are nondecreasing in the revealed 
abilities of the people they hire, they will be happy to hire the best 
ones even if it means they will be fired. The way in which this assump¬ 
tion enters the analysis will be apparent shortly. 4 

Incumbents during period t - 1 know the previous period’s output 
of all incumbents and have their information about the abilities of 
new hires. They may know more than the firm about the ability of 
other incumbents, but there remains some uncertainty in their minds 
about how they will rank once outputs are revealed. They also may 
have some information as to the opinions of other incumbents about 

4 In a model with more than one university, alternative wages might well be known.
However, as long as there is some idiosyncratic “location preference” that is not known,
all the results hold.
the abilities of candidates. They decide what signal to send to the firm
on the basis of this information, the wage and firing rules, and their
knowledge that the firm will use what they reveal to try to maximize
the research output of the department. The firm’s problem at the
beginning of period $t$ is therefore given by

$$\text{maximize} \ \sum_{s=t}^{\infty} \delta^{s-t} \sum_{i \in I_s} E(a_s^i) \tag{11}$$

subject to (5), (6), (7), and (8) as well as

$$E(a_s^i) = a_{s-1}^i \quad \text{for all } i \in I_{s-1} \cap (H_{s-2} \cup H_{s-3}), \tag{12}$$

$$E(a_s^i) = G(\iota^1, \iota^2, \ldots, \iota^{I_{s-1}}) \quad \text{for all } i \in I_s \cap H_{s-1}, \tag{13}$$

and the incentive-compatibility constraint

$$E(W_{t-1}^i)(\hat{\iota}^i = \iota^i) \ge E(W_{t-1}^i)(\hat{\iota}^i \ne \iota^i). \tag{14}$$

Note from (3) that (14) can be satisfied only if the worker’s expectation
of his alternative in future periods is also known. If we assume
that there is no language rich enough for the incumbent to communicate
his alternative (in practice the vocabulary seems limited to “I
quit” or “I’ll stay”), then the main result of the paper is easy to prove.
A more general result is available, however, if we assume that there
are no language constraints but require that the worker reveal truthfully
the value of his alternative so that the university can adjust his
compensation and ensure that (14) is satisfied. This gives us the additional
constraints

$$E(W_{t-1}^i)(\hat{V}_t^i = V_t^i) \ge E(W_{t-1}^i)(\hat{V}_t^i \ne V_t^i) \quad \text{for all } i \in I_{t-1} \cap (H_{t-2} \cup H_{t-3}) \tag{15}$$

and

$$E(W_{t-1}^i)(\hat{V}_{t+1}^i = V_{t+1}^i) \ge E(W_{t-1}^i)(\hat{V}_{t+1}^i \ne V_{t+1}^i) \quad \text{for all } i \in I_{t-1} \cap H_{t-2}. \tag{16}$$

When (14) can be satisfied, we are justified in using the incumbents’
true information in (13).

The administration is inducing the members of its department to 
play a direct revelation game and wants the outcome of this game to 
be truth telling. The administration knows i|/(a), but it does not know 
the quality of this year’s sample. Incumbents, by their own informa¬ 
tion and by informal communication with each other, may know more 
than this. We will therefore assume that while incumbents play Bayes¬ 
ian Nash strategies, truth telling must be implemented independent 
of the priors each incumbent may have over the information of his 
colleagues. The intent of this assumption is to capture a situation in 
which the administration is very much in the dark about the internal
politics and communication possibilities within its department. It tries 
to design reward functions that are robust in that they will induce 
truth telling in a great many possible environments and, in particular, 
will not need fine-tuning from year to year. The implementation 
concept is weaker than dominant strategies only in that in equilibrium 
each member of the department assumes that all other members will 
be telling the truth. 

One final remark is in order before we can begin. It will be shown 
presently that the incentive-compatibility constraint may be binding 
on the outcome so that output is reduced from the first-best level. 
Since the firm finds out about productivities anyway, it has the alter¬ 
native strategy of hiring people at random and later on firing those 
who do not reach a given standard. This strategy will be particularly 
attractive if the length of time it takes for output to be revealed is 
short. Indeed, this is a common explanation for the use of “proba¬ 
tionary periods” during which the performance of new workers is 
closely monitored. We assume here that a key characteristic of aca¬ 
demic employment is the length of time it would take the administra¬ 
tion to discover the ability of new hires. It therefore finds it more 
efficient to run a revelation mechanism. 

It seems clear that if the university plans to follow an optimal hiring 
and firing policy, it may have some problems getting its incumbents to 
identify the best candidates. Since incumbents are unsure at the time 
they reveal their opinions of their own future abilities, the abilities of 
other incumbents, and the actual abilities of candidates, under an 
optimal policy they cannot rule out the possibility that they will be 
asked to leave at some time in the future in order to make room for a 
candidate. Some necessary conditions on their expected reward func¬ 
tions are given in the next proposition. 

Proposition 2. (a) Any differentiable expected reward function
that gives incumbents the incentives to reveal the expected value of
their alternatives in all future periods and to reveal truthfully their
information about candidates in Bayesian Nash strategies must ensure
that the probability that an incumbent is retained in any future
period is independent of his current signal about the abilities of
candidates, given his true information. (b) If implementation is to be
achieved in priors-independent Bayesian Nash strategies, an incumbent’s
expected rewards must be nondecreasing in his current expectation
of the revealed abilities of the candidates who get hired.
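
A stylized numerical example may help fix the intuition behind part a. It is a sketch only: the one-slot retention rule, the function names, and the numbers are assumptions made for illustration and do not appear in the model above.

```python
def expected_reward(report, own_ability, wage, w_next, alt, pension, delta, tenure):
    """Eq. (3)-style payoff to an incumbent who sends `report` about one candidate."""
    # Hypothetical one-slot retention rule: without tenure, the incumbent is
    # kept only if his own expected ability is at least the reported ability.
    retained = 1.0 if tenure or own_ability >= report else 0.0
    return wage + delta * (retained * w_next + (1.0 - retained) * (alt + pension))


def best_report(true_signal, reports, **terms):
    """Report that maximizes the payoff; ties are broken in favor of the truth."""
    return max(reports, key=lambda r: (expected_reward(r, **terms), r == true_signal))


terms = dict(own_ability=5.0, wage=1.0, w_next=10.0, alt=3.0, pension=0.0, delta=0.9)

# The candidate is truly better (signal 8) than the incumbent (ability 5).
baseball = best_report(8.0, [2.0, 8.0], tenure=False, **terms)
with_tenure = best_report(8.0, [2.0, 8.0], tenure=True, **terms)
```

When retention depends on the report, the incumbent gains by understating the candidate (`baseball` is 2.0). When retention is independent of the report, his payoff does not vary with it and the truthful report (`with_tenure` is 8.0) is among his best responses.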

Proof. We will establish condition a for the current period and then 
use an induction argument to extend it to future periods. Partial 
derivatives indicate gradients. We have from (14) and (15), suppressing
the superscript * and time subscripts where possible,

$$\frac{\partial EW_{t-1}}{\partial \hat{\imath}_{t-1}} = 0 \quad \text{for all } i,\ V,\ \hat{\imath} = i,\ \hat{V} = V, \tag{17}$$


and, since the firm receives no objective information about V, 


$$\frac{\partial EW_{t-1}}{\partial \hat{V}} = 0 \quad \text{for all } i,\ \hat{\imath} = i,\ \text{for all } \hat{V}, \tag{18}$$


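and the associated second-order conditions. Since (17) holds for all $V$
with $\hat{V} = V$, we have

$$\frac{\partial^2 EW}{\partial V\,\partial \hat{\imath}} + \frac{\partial^2 EW}{\partial \hat{V}\,\partial \hat{\imath}} = 0 \quad \text{for all } i,\ V,\ \hat{\imath} = i,\ \hat{V} = V. \tag{19}$$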


Since (18) holds for all $\hat{V}$, we have

$$\frac{\partial^2 EW}{\partial \hat{V}^2} = 0 \quad \text{for all } \hat{V}. \tag{20}$$


The second-order conditions therefore require that the second term 
in (19) be zero. Reversing the order of differentiation of the first 
term, we get 


$$\frac{\partial^2 EW_{t-1}}{\partial \hat{\imath}_{t-1}\,\partial V_t} = \frac{\partial R_t}{\partial \hat{\imath}_{t-1}} = 0. \tag{21}$$


This result can be extended to subsequent periods without 
difficulty. Using the same method of proof with (14) and (16), we get 


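$$\frac{\partial^2 EW_{t-1}}{\partial \hat{\imath}_{t-1}\,\partial V_{t+1}} = \frac{\partial}{\partial \hat{\imath}_{t-1}}\left[\delta^2 R_t (1 - R_{t+1})\right] = 0 \tag{22}$$

so that, using (21), we get, in general,

$$\frac{\partial R_{t+j}}{\partial \hat{\imath}_{t-1}} = 0, \quad j = 0, 1. \tag{23}$$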


For part b we note that 

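$$\frac{\partial EW_{t-1}}{\partial \hat{\imath}_{t-1}} = \left.\frac{\partial EW}{\partial \hat{\imath}}\right|_{\alpha} + \frac{\partial EW}{\partial \alpha}\,\frac{\partial \alpha}{\partial \hat{\imath}}, \tag{24}$$

where $\alpha$ is the incumbent’s expectation of the future revealed abilities
of the candidates who are hired. The notation $\partial\alpha/\partial\hat{\imath}$ indicates a matrix
with rows that are gradients of the elements of $\alpha$. The second-order
condition associated with the problem is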



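$$\frac{\partial^2 EW}{\partial \alpha^2}\left(\frac{\partial \alpha}{\partial \hat{\imath}}\right)^2 + \frac{\partial EW}{\partial \alpha}\,\frac{\partial^2 \alpha}{\partial \hat{\imath}^2} + 2\,\frac{\partial^2 EW}{\partial \hat{\imath}\,\partial \alpha}\,\frac{\partial \alpha}{\partial \hat{\imath}} + \left.\frac{\partial^2 EW}{\partial \hat{\imath}^2}\right|_{\alpha} \le 0. \tag{25}$$

Since the incumbent believes that if he reveals truthfully the best
candidates will be hired, we have

$$\frac{\partial \alpha}{\partial \hat{\imath}} = 0, \qquad \frac{\partial^2 \alpha}{\partial \hat{\imath}^2} \le 0. \tag{26}$$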

The relationship between a worker’s information $i$ and his opinion $\alpha$
depends on his beliefs about other incumbents’ information. Since
revelation of $i$ must be achieved independent of these beliefs, the
firm will not be able to use its knowledge of the actual output of a new
professor to infer anything about the true information of the incumbents
at the time he was hired. It may be able to infer something about
their opinions, but many different information vectors will be consistent
with the same opinion. Thus we must have

$$\left.\frac{\partial EW}{\partial \hat{\imath}}\right|_{\alpha} = 0 \quad \text{for all } \hat{\imath}, \tag{27}$$


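so that

$$\left.\frac{\partial^2 EW}{\partial \hat{\imath}^2}\right|_{\alpha} = 0, \tag{28}$$

and we get from (25), (26), and (27)

$$\frac{\partial EW}{\partial \alpha} \ge 0. \tag{29}$$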


Q.E.D. 
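
The last step of part b can be spelled out as follows. This is only a sketch in scalar notation, under the assumption that the expected reward depends on the signal $\hat{\imath}$ both directly and through the opinion $\alpha(\hat{\imath})$. The subscripted symbols $EW_{\hat\imath\hat\imath}$, $EW_{\hat\imath\alpha}$, $EW_{\alpha\alpha}$, and $EW_{\alpha}$ are shorthand introduced here for the corresponding partial derivatives.

```latex
\begin{align*}
\frac{d^2 EW}{d\hat{\imath}^2}
  &= EW_{\hat\imath\hat\imath}
   + 2\,EW_{\hat\imath\alpha}\,\alpha'(\hat\imath)
   + EW_{\alpha\alpha}\,\alpha'(\hat\imath)^2
   + EW_{\alpha}\,\alpha''(\hat\imath) \le 0 .
\intertext{At a truthful signal, $\alpha'(\hat\imath) = 0$ by (26) and
$EW_{\hat\imath\hat\imath} = 0$ by (28), so the condition reduces to}
EW_{\alpha}\,\alpha''(\hat\imath) &\le 0 ,
\intertext{and since $\alpha''(\hat\imath) \le 0$ by (26), this can hold
whenever $\alpha''(\hat\imath) < 0$ only if}
EW_{\alpha} &\ge 0 .
\end{align*}
```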

In a model in which workers live more than three periods, it is clear 
that a simple inductive argument will extend (23) to an incumbent’s 
remaining working lifetime. The result means simply that the univer¬ 
sity, if it expects its incumbents to tell it who the good candidates are, 
cannot make room for good candidates by firing incumbents at the 
time a candidate is hired (the end of the current period) or by plan¬ 
ning to fire them in some future period and borrowing against these 
future savings. 

The intuition behind this result is quite straightforward. If an in¬ 
cumbent is to tell the truth about candidates, he must be indifferent to 
marginal changes in the signal he sends. If his signal affects the prob¬ 
ability that he is retained, then he must be compensated by an amount 
that depends on his alternative. But if his alternative is unknown, 
then in order to get him to reveal it truthfully the overall transfer he 
receives from the university must not depend on his revealed alternative.
Thus his compensation cannot depend on his alternative. The
only remaining possibility is for the incumbent’s signal about candidates
to have no effect on his probability of retention.

Note that in practice the worker’s alternative will not have to be
revealed. When (23) is satisfied, the firm need not know the worker’s 
alternatives to ensure that (14) is satisfied. Thus the worker in each 
period need only reveal his opinions about candidates. This is impor¬ 
tant since it is clear that the kind of revelation that would be necessary 
to implement the contract if this were not true is a very detailed and 
explicit signal as to the actual value of the worker’s alternative. Reve¬ 
lation of alternatives in practice seems to be limited to signals of an 
intention to quit. 

This result (part a) uses only the restriction that incumbents must 
prefer telling the truth to marginal deviations from the truth. When 
all signals are truthful, all incumbents expect the best candidates to be 
hired, so that any deviation in either direction will reduce expected 
output. This means that this result actually puts no restrictions on the 
effect of the revealed abilities of candidates on the wages and reten¬ 
tion probabilities of those who hire them. Part b, however, gives us the 
intuitive restriction that hiring good candidates cannot make incum¬ 
bents any worse off. This is like a piece rate for truthful revelation 
and need only be nonnegative. Independence of priors and the as¬ 
sumption that diversity of information is important rule out schemes 
in which members are rewarded according to how accurately they can 
predict the actual abilities of candidates. 5 

The assumption of a fixed budget was not used anywhere in the 
proof, so the result applies even in a standard model. However, it is 
only when there is an overall budget constraint that the restriction will 
prevent the attainment of the first-best outcome. If workers’ outputs 
are sold on a market, then each can be paid a piece rate based only on 
the value of his own output. Each will separate if this value falls below 
that of his alternative. The abilities of other workers will have no 
effect on this separation decision. 

When the university’s budget is fixed, proposition 1 tells us that the 
optimal firing rule will depend on the expected abilities of available 
replacements. Proposition 2 explicitly rules out any such dependence. 
Thus the first-best is not attainable. Note that the firm is free to hire 
the best candidates it can find, and it is not directly restricted in the 
numbers it hires. It can borrow against future hires in good years and
save positions in bad years. However, it may not take advantage of a
good year for candidates by firing some of its less productive incumbents
(even though they are easily identifiable) and it may not respond
to a bad year by firing fewer than would have been fired otherwise.

5 Constraints on the language with which incumbents can reveal their information
about candidates might also be sufficient to prevent this. The result that incumbents
can be no worse off if they hire better candidates can also be obtained even when
diversity of information is not important. The reason is that the abilities of candidates
who are not hired are never revealed. Thus the incumbents can collude to reveal
truthfully about a set of mediocre candidates and give terrible ratings to some good
candidates, and the administration will be in the dark. Condition (29) is sufficient to
prevent this.

This seems to be the limit of the restrictions the present analysis can 
place on academic contracts. 6 Although we have ruled out a great 
deal, including the first-best, what is left over is not a unique contract. 
However, we have done enough to gain some insights into some cur¬ 
rent practices. The rest of this section contains a more informal dis¬ 
cussion of the institutions observed in academics and argues that they 
are consistent with (23) and (29). We will consider first situations in 
which the budget is expected to remain relatively constant and later 
on situations in which “financial crises,” perhaps including borrowing
constraints, are possible. Define as “feasible” any practice that is
consistent with (23) and (29).

The first point is that if the university is willing to give complete 
authority in hiring decisions to those workers who are facing manda¬ 
tory retirement at the end of the current period, then it can give these 
workers a fixed income and pension and follow an optimal hiring 
rule. Since these workers know they will not be retained next period, 
(23) and (29) are automatically satisfied. Of course the university 
might have trouble if in previous periods all its older candidates had 
been let go. It seems clear, however, that there are gains to obtaining 
the opinions of a larger set of incumbents and in particular those of 
the younger ones. In practice, of course, younger incumbents do have 
input into hiring decisions. 

There is nothing in any of the previous analysis to suggest that the 
university can never fire people. One of the simplest feasible ways to 
do this is to set an exogenous output standard that all incumbents 
must achieve. In this case the present value of the worker’s salary 
depends only on the probability that his own output falls below this 
standard. This policy has good incentive properties but involves some 
misallocation since the university cannot adjust the standard to current
market conditions. It does seem to correspond to the firing practices
of most universities, although the actual standard is set very low.
Short of moral turpitude or refusal to teach classes, job security is in
most cases assured.

6 When the budget is fixed and exhausted each period, it may seem that nothing
other than guaranteed employment and fixed wages for everyone can be consistent
with (23) and (29). In fact, it is possible to reward all incumbents but the most recently
hired one at the margin for increases in their own output and for increases in the
output of the people they have hired. To balance the budget, incumbents will have to
be made worse off by improvements in the performance of their more senior colleagues.
In practice, the use of “anomalies funds” and other extraordinary sources of
income seems sufficient to allow deserving professors to be rewarded without explicit
cuts in the salaries of others.

There are perhaps some reasons for this low firing standard. First, 
the standard must be absolutely exogenous. If there is any possibility 
that by revealing that the current candidates are particularly bad or 
by stocking the university with low-quality professors the incumbents 
can induce the firm to lower the standard, the result fails. Of course in 
bad years it may be in the firm's own ex post interest to lower the 
standard since otherwise it will have to reduce employment or hire 
people worse than the ones it is firing. 

More important, the distribution of abilities of candidates will 
surely change over time. Unless these changes can be anticipated at 
the time that the exogenous firing standard is set, so that an accurate 
time trend can be introduced, errors will accumulate and the firing 
rule may become severely out of date. Since the university’s knowl¬ 
edge of the distribution of the abilities of candidates depends ulti¬ 
mately on information provided by the incumbents (both through 
what they reveal and by the revealed performance of the people they 
hire), incumbents would have to be “grandfathered” through any 
anticipated change in the standard. 

It is also possible to fire people if they are unproductive relative to 
their colleagues if this can be done consistently with (29). This is 
perhaps the rationale behind contract “buy-outs,” in which particularly
unproductive professors are rewarded by being bribed to leave.
Buy-outs are costly because of the bargaining necessary to ensure that
the separation is voluntary (since the incumbent’s alternative is not 
public knowledge). They are also rare, perhaps for this reason as well 
as for the reason that if they were more common, incumbents with 
relatively good alternatives might purposely reduce their output in 
order to get themselves bought out. 

Another possibility is to compare incumbents with standards that 
do not depend on the revealed abilities of the incumbents that they 
had a hand in hiring. There is no danger in comparing workers 
within a given cohort since they never had the opportunity to reveal 
information about each other, and it is also feasible to compare work¬ 
ers with the standards set by more senior workers. This also seems 
consistent with current practice: incumbents are reviewed after a 
specified period, and those who do not meet with the approval of 
their senior colleagues are involuntarily released. 

The present analysis leads to two remarks about this practice. First, 
if the opinions of the least senior workers (i.e., the untenured ones) 
about candidates are especially important since their training is the 
most current, it will be necessary to make the outcome of their own 
tenure decisions independent of the abilities of the current candidates
and the less senior untenured workers. A tournament among young
workers for a fixed number of tenured positions is not feasible, but 
hiring candidates for tenure-track positions will be. Of course, if the 
workers who have passed this test already are the only ones whose 
opinions matter, then there are no restrictions on the way untenured 
faculty may be treated. 

Second, there is nothing within this model to suggest that the tenure
decision be made only once during a worker’s career. It could be
done every period as long as the standards used are consistent with 
(23) and (29). For example, if the quality of candidates has increased 
over time, the standards used for older incumbents would have to be 
lower than those used for younger ones. 7 

Early retirement is another way to get rid of incumbents. If the 
workers who retire are to be chosen by the firm, then these will be 
considered buy-outs and will be subject to the same problems. Costly 
ex post bargaining seems to be the only way to determine the amount 
of the pension or cash settlement. However, voluntary early retire¬ 
ment for a subset of the employees determined by some criterion 
other than relative ability (i.e., age, years of service, or the size of the 
budget) is perfectly feasible. Under this plan, all the employees in 
the subset are given a take-it-or-leave-it offer. Although some of the 
members will be only too happy to accept, they will not be able to 
affect the size of the offer or their membership in the group to which 
it is offered by what they reveal about candidates. While it might be 
tempting to offer early retirement to those workers whose outputs fall 
below some budget-determined output standard, this is also open to 
manipulation. As mentioned earlier (and although it has not been 
explicitly modeled), workers with high alternatives might purposely 
reduce their output in order to become eligible for early retirement. 

7 In practice, it might not be very helpful to have subsequent reviews since the quality
of candidates can have no effect on the outcomes. Another problem (outside this
model) has been suggested to me by a university administrator. If all contracts were of
limited term, subject to renewal, it would be difficult to get incumbents to take the job
of evaluating their younger colleagues seriously. Personal evaluations are very difficult,
and the temptation would always be to renew. The fact that the members of the
department must “live with their mistakes” induces them to devote considerable care to
the initial decision. In practice, in many cases departments are not allowed to renew
term contracts, short of giving the person a tenure-track appointment.

In recent years several universities in the United States and Canada
have faced severe budget cutbacks combined with borrowing constraints
that have forced them to eliminate positions that would not
normally have been lost. In a world in which incumbents anticipate
the possibility of these crunches and the inevitability of involuntary
separations should they occur, the policies the universities follow can
be very important. In particular, it is sometimes argued by younger
academics that the best policy is to fire on the basis of ability only.
Such a rule is inconsistent with (29). Incumbents will try to stock the
university with poor-quality professors to reduce the chances that
they will be the ones to be let go. Pure seniority rules are feasible if the
cutback specifies a maximum number of positions that can be saved.
Across-the-board wage cuts and random layoff rules are also feasible,
but these seem to have little to recommend them other than simplicity.
One interesting alternative is the elimination of entire departments.
This practice has very good incentive properties since members
of one department do not choose the new hires of another. In
fact under this regime, incumbents should be doing whatever they
can to increase the average quality of their departments. This idea has
been implemented in several cases. 8
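
The incentive difference between these layoff rules can be illustrated numerically. The following is a minimal sketch and not part of the model itself: the function name `layoff_probability`, the uniform distribution of outputs, and the department size are all illustrative assumptions. It estimates the chance that the most senior incumbent is laid off in a cutback, as a function of the quality of the candidate the department has hired.

```python
import random


def layoff_probability(rule, hire_quality, n_incumbents=5, n_cuts=1,
                       n_draws=20000, seed=0):
    """Estimate the chance that the most senior incumbent is laid off.

    Hypothetical setup: incumbents 0..n_incumbents-1 are ordered by
    seniority (0 is the most senior) and have outputs drawn uniformly
    from [0, 1]. The new hire is the most junior member and has output
    `hire_quality`. A cutback removes `n_cuts` members.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    laid_off = 0
    for _ in range(n_draws):
        outputs = [rng.random() for _ in range(n_incumbents)]
        outputs.append(hire_quality)  # last index is the new hire
        members = list(range(n_incumbents + 1))
        if rule == "ability":
            # Members with the lowest outputs are let go first.
            order = sorted(members, key=lambda m: outputs[m])
        elif rule == "seniority":
            # The most junior members are let go first.
            order = sorted(members, reverse=True)
        else:
            raise ValueError(f"unknown rule: {rule}")
        if 0 in order[:n_cuts]:
            laid_off += 1
    return laid_off / n_draws


for rule in ("ability", "seniority"):
    weak = layoff_probability(rule, hire_quality=0.1)
    strong = layoff_probability(rule, hire_quality=0.9)
    print(f"{rule:9s}  weak hire: {weak:.3f}  strong hire: {strong:.3f}")
```

In this sketch the incumbent's layoff risk under the ability rule rises with the quality of the hire, which gives him a reason to favor weak candidates. Under the seniority rule his risk does not vary with the quality of the hire.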

There is also some evidence from a rather different context that is 
consistent with the model. The hiring practices followed by symphony 
orchestras in North America and Europe are somewhat different. In 
North America, hiring decisions are typically made by a musical di¬ 
rector in consultation with a board of advisors and the regular con¬ 
ductor, if any. Members of the orchestras are unionized but can be 
replaced if their playing is not up to standard. In Europe, there are 
some symphonies, in particular the Berlin Philharmonic, in which any 
member of the orchestra may attend a candidate’s audition and final 
decisions are made by a vote of the entire orchestra. Voting members 
of the Berlin Philharmonic have tenure. Since there seem to be no 
other relevant differences between this orchestra and those in North 
America, this is powerful support for the model. 9 

The use of a direct revelation game to analyze the problem makes 
the results quite general by virtue of the revelation principle (see 
Myerson 1979; Harris and Townsend 1981). In another paper (Car¬ 
michael 1986) I have examined the more general problem of induc¬ 
ing members of an organization to make decisions that are in the 
interests of the organization as a whole. The restrictions obtained are 
very similar. Personnel decisions, that is, those that affect the position 
or status of workers in the organization, cannot be made by the work¬ 
ers themselves if the preferences of workers over various assignments 
are private information. This may explain the existence of personnel 
departments, whose members make decisions about the status of 
other workers in the organization without personally being affected. 
This is in spite of the fact that the workers themselves, or their
coworkers, have the best information about their relative qualifications.
Two recent papers by Lazear (1986) and Milgrom and Roberts (1987)
also emphasize the importance of making workers nearly indifferent
to, or approving of, decisions that are good for the whole team.

8 In particular, at the University of Michigan and the University of British Columbia.

9 I have searched extensively for written confirmation of this information, but without
success. This paragraph is based on conversations with several musicians and members
of the Queen's University music department.

III. Conclusions 

Universities in Canada and Great Britain and some of the publicly 
funded universities in the United States are faced with declining real 
budgets, while at the same time the quality of candidates, if anything, 
is increasing. This has led in some quarters to a call for the abolition 
of tenure so that less productive older faculty can be let go to make 
room for younger ones and to prevent the creation of a “lost genera¬ 
tion” of young scholars. 10 This paper suggests that such a policy 
would have disastrous results. The most brilliant of the young scholars
would actually find their job prospects reduced, unless the administrations
began to devote extensive resources to the screening of
applicants. The counterproposal of providing generous voluntary 
early retirement plans, rather than a self-serving suggestion from lazy 
academics, is seen here as a sensible solution to an important incentive 
problem. 

The model has also been able to make some predictions about the 
institutions one should expect to observe being used in academics. In 
particular, we have been able to rule out as infeasible the first-best 
practice of firing the weakest incumbents, either to replace them on 
an ongoing basis or to save money in times of financial crisis, even 
though they may be readily identifiable and clearly overpaid. Some of 
the common practices of universities were shown to be consistent with 
the model. These include exogenous but lenient universal perfor¬ 
mance standards and periodic reviews in which incumbents are com¬ 
pared with the standards set by others in the same cohort and the 
more senior workers. In times of financial crisis, early retirement 
plans and elimination of entire departments are feasible, with the 
latter one perhaps to be recommended, particularly if there are no 
unusually able people in the worst departments. 

Perhaps the most important extension of the model will be to the 
case in which there is more than one university in the economy. In 
this context an “outside offer” will be a particularly good signal of 
ability since it is information about the opinions of incumbents in 
another department. There is also the possibility that the administration
could circumvent its own employees and hire on the basis of
outside references only. However, the interpretation of letters of reference
is also a task that is perhaps best left to the experts in one’s own
departments. Also (although this requires study), if other departments
are interested in keeping the best people because their administrations
have provided the right incentives, they may not be willing
to reveal their true opinions to other universities.

10 This argument is set forth in The Great Brain Robbery, by Robert Bothwell (Toronto),
Jack Granatstein (York), and David Bercuson (Calgary). This “polemic” is referenced
by Polanyi (1984, p. 9).

It is also clear that some of these ideas extend in a modified form to 
other markets. The basic problem is to ensure that the members of an 
organization have the incentive to make the right decision, assuming 
they have already collected all the relevant information. This paper 
has shown that the problem may not be trivial, in spite of the common 
assumption in team theory that the members of an organization share 
its goals. In fact the basic message of this paper is that in order to 
ensure that the members of the organization share its goals, the form 
of the organization itself is restricted in empirically relevant ways.

A Nonuniform Pricing Model of Union Wages
and Employment

Unlike implicit contracts models, the nonuniform pricing model of
unions assumes that firms can always shut down ex post to avoid any
payments to the union. Under this restriction, employment can differ
from a first-best even if both workers and firms are risk neutral.
In general, the union chooses to offer quantity discounts on labor
and needs to use a seniority rule that regulates the order in which
workers are hired to implement these discounts. Unions lower (almost)
all workers’ employment probabilities and increase the cyclical
volatility of employment, and the union-nonunion average
wage differential will move countercyclically. Workers’ preferences
over union wage profiles, conditional on their seniority, exhibit
(within limits) a convenient “unanimity” property.


I. Introduction

It is now well known that a product market monopolist can discriminate
among consumers with private information by implementing a
nonuniform price schedule (Spence 1977; Mussa and Rosen 1978;
Goldman, Leland, and Sibley 1984; Maskin and Riley 1984). An optimal
nonuniform price schedule yields higher profits than a constant
price per unit and typically involves quantity discounts (Spence 1977,
pp. 11-13).

Despite the simple and powerful results of the nonuniform pricing
model, it has apparently never been applied to factor market monopolists
such as trade unions. Models of employment contracts with
asymmetric information (Green and Kahn 1983; Hart 1983; Rosen
1985) come close to doing so but differ from the nonuniform pricing
model by implicitly dropping one of its key assumptions. This is the
notion that the seller (union) cannot impose unlimited lump-sum
taxes on the buyers (the firm in different states) (Spence 1980, p.
822). Instead, contract models typically introduce assumptions such
as risk-averse firms and “income effects” stemming from incomplete
capital markets for workers, to generate nontrivial results. 1 The
usefulness and realism of these assumptions (to which the model is
very sensitive) have recently been strongly questioned (Topel and
Welch 1986).

This paper develops a model of trade union wages and employ¬ 
ment that is based on nonuniform pricing theory. It thus imposes the 
restriction that workers, as a group, cannot impose any charges on the 
firm if it hires no one. This can cause employment to deviate from 
the competitive level even in the absence of employer risk aversion or 
income effects. It also allows us to generate a number of interesting 
and novel predictions regarding union wage and employment policy, 
including the following. 

First, under fairly general conditions, implementation of a union’s 
optimal nonuniform pricing policy for labor in any single period 
involves the use of a “seniority system” with the following properties: 
(i) each worker in the union is assigned a “seniority index,” which 
determines his position in a queue governing the order of layoffs and 
hires; (ii) workers with more firm-specific skills are assigned the more 
desirable positions in this queue (i.e., greater priority in hiring); and 
(iii) wages vary with seniority in such a way that the markup charged
by the union on labor is higher for those workers with the lowest 
probabilities of a layoff. Thus the present paper offers a formal ex¬ 
planation of why unions might wish to impose seniority systems on 
firms. 2 

1 Hall and Lilien (1979) have shown that a first-best allocation can be achieved in 
implicit contract models despite the presence of asymmetric information when firms 
are risk neutral and workers' preferences for leisure exhibit no income effects. The 
model’s sensitivity to assumptions about risk aversion and income effects is discussed in 
Cooper (1983). 

2 Carmichael (1983b) provides a rationale for nonunion seniority systems, while
Tracy (1986) offers an alternative explanation of union seniority. The current paper is
not inconsistent with the use of seniority as a criterion for layoffs and rehires in nonunion
firms but does predict a role for seniority rules in union firms that does not arise
elsewhere. Evidence (e.g., Freeman and Medoff 1984, pp. 123-26) that seniority criteria
for layoffs and rehires are more important in union than nonunion firms seems to
support this view.

Second, the model predicts the following about union-nonunion
differences in wage and employment patterns: (i) unionism lowers the 
equilibrium employment probability of (almost) every worker in the 
firm, relative to a nonunion worker with the same level of seniority; 
(ii) the level of employment is more cyclically volatile in union than in 
nonunion firms; and (iii) the union-nonunion average wage differen¬ 
tial is predicted to move countercyclically. 

Third, the nonuniform pricing model offers an interesting solution 
to certain analytical problems that have arisen in previous models of 
union politics. These models typically either (i) simply assume a 
union utility function without deriving it from members’ preferences 
(Brown and Ashenfelter 1986; MaCurdy and Pencavel 1986) or 
(ii) make some assumption that simplifies the political problem but 
that is, unfortunately, counterfactual. Such assumptions include ran¬ 
dom layoffs or complete work sharing (McDonald and Solow 1981) 
or, alternatively, equal wages for all union members in the presence 
of an exogenously imposed seniority rule for layoffs (Grossman 1983; 
Blair and Crawford 1984). 

The nonuniform pricing model presented here avoids both of 
these pitfalls: Microfoundations are provided by deriving a union 
maximand from utility functions of all the members. Heterogeneity 
in workers’ preferences is induced by an endogenous seniority system, 
and wages are allowed to vary with seniority. Indeed, the current 
model shows, somewhat surprisingly, that the introduction of differ¬ 
ent wages for different workers into a seniority model such as Gross- 
man’s (1983), rather than complicating matters, can greatly simplify 
the union’s political problem. This is because, subject to certain re¬ 
strictions, every worker’s most preferred seniority wage profile for the 
union will be the same. Within limits, the optimal union wage policy 
could thus be unanimously approved by all members, and a union can 
be operated with a rather minimal need for internal political decision 
making. 

The paper is organized as follows. In Sections II and III, I analyze a
relatively simple case, referred to as the “basic model.” This model 
considers a union with a fixed, inherited stock of equally productive 
members (who may, however, differ in how long they have been in 
the firm to date). The union in the basic model is viewed as deciding 
what wage policy to adopt for a single period, under the assumption 
that this decision has no impact on its opportunity set in any future 
periods. In this model, I show that it is almost always in the union’s 
interest to regulate the order in which workers are hired by assigning 
each worker a rank in a seniority queue. There is, however, no clear
rationale in this simple context, apart from the fact that the union may
“care” more about older workers’ utilities, for assigning seniority
ranks according to workers’ years of service, as opposed to any other
criterion.

Section IV of the paper considers the effects of relaxing the main 
assumptions of the basic model. Among other things, these exten¬ 
sions examine the optimal assignment of seniority ranks to workers in 
a richer environment, suggesting reasons for assigning the best jobs to 
workers with the most years of service, which seems to be the case in 
reality. The paper concludes in Section V with a summary and a brief 
discussion of some issues concerning the empirical testing of the 
model. 

II. The Basic Model: Structure 

Imagine a risk-neutral firm operating for a single period and employing
a single variable input, labor. Its production function is given
by $F(L)$, where $L \in \mathbb{R}_+$ is the amount of labor hired, measured in
efficiency units. Define $f(L) \equiv F'(L)$, and assume that $f$ is decreasing
and continuous. 3 Throughout the paper, it will be useful to index
units of labor or human capital by $l$, that is, to use $L \equiv \int_0^L dl$. Thus
one can think of $f(l)$ as the marginal product of the $l$th unit of human
capital hired and, analogously, of $w(l)$ as the marginal cost to the firm
of the $l$th unit of human capital.

It is convenient to think of this firm as purely competitive in product
markets but as earning some rents to a fixed factor (say, entrepreneurial
ability) that a union seeks to extract. The output price, $\theta$,
faced by the firm is unknown at the beginning of the period (ex ante),
when the nonuniform pricing schedule is negotiated. Ex post, after
the wage schedule is established, $\theta$ is revealed to the firm only and
production occurs. Let $\theta$ be distributed on $[\underline{\theta}, \bar{\theta}]$, with $\bar{\theta} > \underline{\theta} > 0$.
Denote the density of $\theta$ by $m(\theta)$, its right-hand cumulative by
$M(\theta) = \int_{u=\theta}^{\bar{\theta}} m(u)\,du$, and assume that $m(\cdot)$ is continuous.

Throughout the paper, I assume that the firm is able to take two 
kinds of actions to maximize profits, both of them ex post. These are 
(1) choosing the level of total labor input L and (2) if not restricted in 
its choice by union rules, choosing the composition of L, that is, which 
workers are to be employed, and for how many hours each. 4 

The union facing this firm is a continuum of members, each
identified by a value of $z \in (0, Z]$. The interval $(0, Z]$ is assumed to
contain all the workers the union “cares” about (in the sense that they
have a positive “weight” in the union’s welfare function; see below),

3 Differentiability of $f(L)$ is not required in the proofs of the paper’s main results.

4 In what follows, I show that it is always in the union’s interest to control which
workers are employed in each state, as well as the level of hours for each employed
worker. As a result, situations in which firms control the composition of $L$ ex post will
never actually arise in this paper.

who may be fewer or more in number than the highest number of
workers ever employed in this firm. It is convenient to assume, with¬ 
out loss of generality, that Z exceeds the highest employment level 
ever chosen by the firm. 

Union members are risk neutral, and all have the same utility function:

$$V(\omega, e) = \omega + \bar{\omega}(1 - e), \tag{1}$$

where $\omega$ is a worker’s total compensation from employment, $e$ is hours
worked, or “effort,” and $\bar{\omega}$ is a parameter measuring the value of
leisure or of alternative employment, assumed to be the same for all
workers. Worker $z$’s level of firm-specific skill as well as his labor
supply “technology” are summarized by the function $h(e, z)$, which
gives the total amount of human capital supplied by worker $z$ when
working $e$ hours. In the basic model I shall assume that hours per
employed worker are fixed (thus $e \in \{0, 1\}$) and that all workers are
equally able; that is, $h(e, z)$ is independent of $z$ and, without loss of
generality, $h(1, z) = 1$.

Since a worker’s compensation may depend in general both on his
identity and on the state of nature, his expected utility can be written
as

$$U(z) = \int_{\theta=\underline{\theta}}^{\bar{\theta}} [\omega(z, \theta) - \bar{\omega}\, e(z, \theta)]\, m(\theta)\, d\theta. \tag{2}$$

The union is assumed to maximize the welfare function

$$\int_{z=0}^{Z} A[z, U(z)]\, dz, \tag{3}$$

where $A_2 \ge 0$ and $A_{22} \le 0$. 5 Of course, this objective function admits a
number of interesting special cases, 6 one of which will serve as a
useful reference point below. This is the case in which $A_{12} = A_{22} = 0$,
an objective one would expect the union to adopt if it were indifferent
to distribution or, contrary to the “property-rights” assumption
described below, if it had unlimited access to lump-sum interworker
transfers that would serve to “linearize” its utility-possibility frontier.
When $A_{12} = A_{22} = 0$, the union maximizes expected total rents
extracted from the firm.

What tools are available to the union in trying to maximize (3)? In
the basic model I consider two instruments, both of which are functions
chosen before $\theta$ is realized. The first is a wage schedule, $w(z)$,

5 Loosely speaking, one can think of this maximand as representing the outcome of a
cooperative game among the set of union members: by maximizing (3) I guarantee that
the outcome of this game is on the utility-possibility frontier for union members.

6 One example is a “boss-dominated” union, in which case $A_2 = 0$ for all members
but one (who may or may not ever be employed).
specifying the amount charged the firm for the right to employ each
worker, $z$. 7 The second is a seniority assignment rule $n(z)$, which is a
one-to-one mapping of $(0, Z]$ into itself, with the following property:
whenever the firm employs a worker with a certain value of $n$, say $n'$,
all workers with $n < n'$ must be employed also. Given this system, it is
natural to refer to workers with low values of the seniority index as
senior workers; thus a worker’s seniority index, $n$, will be inversely
related to his level of seniority, in common parlance.

Because workers are identical and hours are fixed in the basic
model, the worker with seniority index $n$ corresponds exactly with the
$l$th unit of labor employed by the firm (i.e., $n = l$). Thus, throughout
Section III, the index $n$ is suppressed, with $l(z)$ denoting the seniority
assignment rule. Sections IVC and D reintroduce $n$ in order to analyze
cases in which workers differ and hours are variable.

The rationale for considering a seniority rule is most easily seen by 
considering the kinds of outlay schedules the union can enforce. 
Without a seniority rule, the firm’s total outlays on labor, W(L), will 
always increase at a nondecreasing rate with L. This is because, 
whenever workers are priced differently, the firm always hires the 
cheapest ones first. With a seniority rule, however, the union can 
impose a general outlay schedule $W(L) = \int_{l=0}^{L} w(z(l))\,dl$ on the firm,
allowing us to use the tools of nonuniform pricing theory. The only
restriction on this schedule is that, because there is no worker with
$l = 0$, $W(0) = 0$. This technique also allows us to check whether the
seniority rule has any value to the union simply by ascertaining
whether the optimal overall outlay schedule involves quantity
discounts anywhere in its domain.
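
The contrast between the two outlay schedules can be illustrated with a small Python sketch. It rests on assumptions that are not in the paper: workers are discrete rather than a continuum, the four wage figures are invented, and the helper names (`outlay_without_rule`, `outlay_with_rule`, `marginal_costs`) are hypothetical. Without a seniority rule the firm hires the cheapest workers first, so marginal outlays can only rise; with a rule, the same wages can be arranged into a schedule with quantity discounts.

```python
from itertools import accumulate


def outlay_without_rule(wages):
    """Cumulative outlay W(L) when the firm may hire the cheapest workers first."""
    return [0.0] + list(accumulate(sorted(wages)))


def outlay_with_rule(wages_by_rank):
    """Cumulative outlay W(L) when hiring must follow seniority order (rank 1 first)."""
    return [0.0] + list(accumulate(wages_by_rank))


def marginal_costs(outlay):
    """Increment in total outlay from hiring each successive worker."""
    return [b - a for a, b in zip(outlay, outlay[1:])]


# Hypothetical wages, listed by seniority rank (most senior worker first).
wages_by_rank = [4.0, 3.0, 2.0, 1.0]

free = outlay_without_rule(wages_by_rank)
ruled = outlay_with_rule(wages_by_rank)

print(marginal_costs(free))   # nondecreasing marginal outlay
print(marginal_costs(ruled))  # decreasing marginal outlay: a quantity discount
```

Both schedules start at zero and cost the same when every worker is hired; only the order of hiring, and hence the shape of the schedule in between, differs.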

What restrictions does the union face in attempting to maximize (3)
via an appropriate $w(l)$ and $l(z)$? In addition to the constraints already
mentioned—that the firm maximizes profits ex post (i.e., asymmetric
information) and that $W(0) = 0$—I impose two additional constraints
in the basic model. One of these is individual rationality for all union
members. The other is the restriction that no transfers can be made
between union members. 8 For this assumption to be meaningful, I
need to specify an initial allocation of property rights to wage pay- 


7 The fact that the union chooses such a schedule effectively implies that wages are 
prices the union attaches to the services of individual workers (as opposed to being 
prices charged for tasks performed, e.g.) and is crucial to the need for a seniority rule 
in this model. The consequences of relaxing it are considered in Sec. IVC. 

8 This assumption is initially imposed in order to avoid developing a model that relies 
heavily on transactions that appear to be rarely observed in reality. Indeed, of the two 
most obvious ways of making such transfers, one—union dues—is severely
circumscribed by provisions in sec. 8(b)5 of the National Labor Relations Act, and the
other—unfunded layoff pay—is rarely used (Tracy 1986, n. 6). Surprisingly, however, the
no-transfers assumption turns out not to be crucial since proposition 4 indicates that the
results are unlikely to change if free access to transfers is allowed. 
ments. The “natural” assumption here seems to be the following:
Each worker, $z$, has a property right to all payments made by the firm
for the right to use his services. In the basic model with no transfers,
this assumption implies that

$$\omega(z, \theta) = w(z, \theta) = \begin{cases} 0 & \text{if worker } z \text{ is unemployed} \\ w(z) & \text{if worker } z \text{ is employed.} \end{cases} \tag{4}$$

In other words, worker $z$’s total income in state $\theta$, $\omega(z, \theta)$, equals the
amount paid by the firm to worker $z$ in state $\theta$, $w(z, \theta)$, which is also the
entire amount paid by the firm for the right to use his unit of human
capital.

Finally, it is perhaps worth noting here that one restriction not 
imposed in the basic model is an ex ante expected profit constraint for 
the firm. Thus the basic model analyzes a situation in which the firm 
has so little bargaining power that the union effectively chooses the 
seniority-wage profile w(l) unilaterally. Since the union can never 
extract all the firm’s ex ante rents here (because of asymmetric
information) and since the firm is free to shut down ex post, this never
violates individual rationality for the firm. 

Given all the assumptions of the basic model, the union’s maximization
problem can now be written as 9

$$\max_{w(l),\, l(z)} \mathcal{W} = \int_{z=0}^{Z} A\{z,\; p(l(z))\,w(l(z)) + [1 - p(l(z))]\,\bar{w}\}\, dz \tag{5}$$

subject to

$$w(l) \ge \bar{w}, \quad \forall\, l, \tag{6}$$

where $p(l)$ is the employment probability of worker $l$ given the entire
seniority-wage profile and given optimizing behavior by the firm, and
$\bar{w} = \bar{\omega}$ is the opportunity cost to the union of each unit of human
capital sold to the firm.


III. The Basic Model: Results 

It is convenient to characterize the solution to (5) and (6) in two 
stages: first by characterizing the optimal seniority-wage profile, w(l), 
conditional on a given assignment of seniority indexes, and then by 


9 Interestingly, if the union maximizes rents, it is easy to see that its optimal policy 
derived by solving the maximization problem (5)-(6) is also the best it can do via any 
mechanism, given the information constraints of the problem. This result is well known 
from implicit contract theory (see, e.g., Hart 1983; Rosen 1985) and can be shown using 
the revelation principle by deriving the optimal direct, incentive-compatible
mechanism and showing that it can be achieved via the appropriate outlay schedule $w(l)$.
Unfortunately, this result does not appear to hold in general, suggesting that better 
(but likely extremely complicated) mechanisms than the one considered here exist. 
considering optimal overall policy, with both $w(l)$ and $l(z)$ endogenous.
Thus, until we return to the optimal $l(z)$ in proposition 5 below,
(5) will be replaced by

$$\max_{w(l)} \mathcal{W} = \int_{l=0}^{Z} \alpha\{l,\; p(l)\,w(l) + [1 - p(l)]\,\bar{w}\}\, dl, \tag{7}$$

which conditions on a given $l(z)$ and has $\alpha_2 \ge 0$, $\alpha_{22} \le 0$.

The implications of profit maximization by the firm for $p(l)$ in (7)
are summarized in the following proposition.

Proposition 1. U- and B-Intervals.—For any $w(l)$ that is continuous
from the left, the employment probability of worker $l$ will be given by
either (i) the probability that his own value of marginal product
(VMP) exceeds his own wage, that is,

$$p(l) = M\!\left[\frac{w(l)}{f(l)}\right] \equiv M(\theta^*(l)), \tag{8}$$

or (ii) the probability that the combined VMPs of the workers in some
interval $(\underline{l}, \bar{l}]$, which contains $l$, exceed their combined wages, that is,

$$p(l) = M\!\left[\frac{\int_{s=\underline{l}}^{\bar{l}} w(s)\,ds}{\int_{s=\underline{l}}^{\bar{l}} f(s)\,ds}\right] \equiv M(\theta^*(l)), \quad \text{for } \underline{l} < l \le \bar{l}. \tag{9}$$

Proof. See the Appendix. Q.E.D.

If worker $l$’s employment probability is given by (8) in proposition
1, I shall say he belongs to a U-interval, for “unrelated” workers. This
is because, as is shown below, the optimal wage for worker $l$ is
independent of wages charged for all other workers in the interval. The
other distinguishing characteristic of U-intervals is that, throughout a
U-interval, $d\theta^*/dl > 0$, where $\theta^*(l)$ is defined as the lowest state in
which worker $l$ is employed. Thus employment adjusts continuously
to the state, and each worker in the interval has a different
employment probability.

If worker $l$’s employment probability is given by (9), he is said to be
in a B-interval, for “bundled” workers, since all workers in the
interval share the same employment probability. Thus, over B-intervals,
$d\theta^*/dl = 0$, and employment adjusts discontinuously to the state.
“Interior” B-intervals occur when wages decline rapidly or
discontinuously with $l$, as is shown by figure 1a (where area A = B) and by the
second B-interval in figure 1b, respectively; B-intervals can also occur
when $\theta^*$ is at an endpoint and, for example, $p = 1$, as is shown by the
first bundling interval in figure 1b. 10

Corollary. The union’s optimal wage and employment policy
within any B-interval or U-interval is independent of the choices it
makes anywhere outside that interval.

Proof. To see this, simply write out the expression for the union’s
maximand, (7), for the interval in question, as is done in (10) and (14)
below. Q.E.D.

Proposition 1 and its corollary mean that it is legitimate to compute
the union’s optimal seniority-wage policy for a given $l(z)$ by considering
optimal policies in U-intervals and B-intervals in isolation and
then examining when and where each type of interval will occur,
which I do in turn below.

A. Optimal Policy in a U-Interval 

Within a U-interval, $l \in (\underline{l}, \bar{l}]$, the union’s objective function is 11

$$\mathcal{W} = \int_{l=\underline{l}}^{\bar{l}} \alpha\left\{l,\; [w(l) - \bar{w}]\, M\!\left[\frac{w(l)}{f(l)}\right]\right\} dl. \tag{10}$$

Pointwise optimization of (10) with respect to $w(l)$, assuming that the
individual rationality constraint does not bind, yields the first-order
condition

$$M\!\left[\frac{w(l)}{f(l)}\right] = \frac{w(l) - \bar{w}}{f(l)}\; m\!\left[\frac{w(l)}{f(l)}\right], \quad \forall\, l. \tag{11}$$

The optimal policy summarized in (11) has three key properties. First,
it never violates workers’ individual rationality constraints (this is
easily ascertained by considering $d\mathcal{W}/dw$ at $w = \bar{w}$); thus these constraints
are ignored in U-intervals henceforth. Second, if worker $l$ is in a
U-interval, the optimal wage for worker $l$ is independent both of the
form of the union’s welfare function $\alpha[l, U(l)]$ and of the wages
charged for all other workers. The third property is apparent when
we consider the following problem:

$$\max_{w(l)}\; [w(l) - \bar{w}]\, M\!\left[\frac{w(l)}{f(l)}\right], \tag{12}$$

that is, choose the single monopoly wage that maximizes worker $l$’s
expected utility conditional on his seniority assignment, which has the
same first-order condition as (10) for each $l$. Thus the union’s optimal
wage for worker $z$ is identical to the wage he would choose to set for
himself, conditional on his seniority level, if bundling is ruled out.

10 Figure 1b also shows how, when $w(l)$ jumps upward, the firm’s optimal employment
level may be independent of $\theta$ for some $\theta$, both on the interior of U-intervals as
well as at the boundary between intervals of different types.

11 It is convenient to treat both U- and B-intervals as half open throughout the
analysis.
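
A short numerical sketch may help fix ideas about the U-interval wage. It assumes, purely for illustration, that the state is uniformly distributed on an interval `[lo, hi]`, in which case the first-order condition has the closed-form solution `w = (hi * f + w_bar) / 2` whenever the implied critical state `w / f` lies inside the support. The parameter values and function names below are hypothetical and are not taken from the paper. The sketch checks the closed form against a brute-force maximization of the worker's own expected rent, `(w - w_bar) * M(w / f)`.

```python
def right_cdf(theta, lo, hi):
    """M(theta): probability that the state is at least theta (theta uniform on [lo, hi])."""
    if theta <= lo:
        return 1.0
    if theta >= hi:
        return 0.0
    return (hi - theta) / (hi - lo)


def expected_rent(w, f, w_bar, lo, hi):
    """Worker's expected gain over the outside option: (w - w_bar) * M(w / f)."""
    return (w - w_bar) * right_cdf(w / f, lo, hi)


def closed_form_wage(f, w_bar, hi):
    """Wage solving M = (w - w_bar) * m / f for uniform theta (interior case only)."""
    return 0.5 * (hi * f + w_bar)


def grid_wage(f, w_bar, lo, hi, steps=20000):
    """Brute-force maximiser of expected_rent over w in [w_bar, hi * f]."""
    top = hi * f
    grid = [w_bar + (top - w_bar) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda w: expected_rent(w, f, w_bar, lo, hi))


LO, HI, W_BAR = 1.0, 3.0, 1.0      # assumed parameter values
for f in (1.0, 0.8, 0.5):          # marginal products, most senior worker first
    w = closed_form_wage(f, W_BAR, HI)
    print(f, round(w, 3), round(w / f, 3))
```

With these assumed numbers the wage falls from 2.0 to 1.7 to 1.25 as the marginal product falls, while the critical state rises from 2.0 to 2.125 to 2.5, so less senior workers are both cheaper and less likely to be employed. Each wage depends only on that worker's own marginal product, which is the independence property described in the text.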

The second-order condition for a maximum of (10) can be written
as

$$2m^2 + Mm' > 0. \tag{13}$$

I shall assume henceforth that the distribution of $\theta$ satisfies this
condition globally, which (among other things) rules out the possibility of
upward discontinuities in $w(l)$ in the interior of U-intervals, such as
that shown in figure 1b. Condition (13) is equivalent to a condition
often used in auction theory, which is known as an increasing
second-value statistic, $J(\theta)$. It is satisfied by a large family of distributions (see
McAfee and McMillan [1987] for a more detailed discussion) and is
also implied by the nondecreasing hazard condition invoked later in
this section.

B. Optimal Wage Policy in a B-Interval 

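Suppose that the union chose to implement a “bundling” policy for
some “interior” group of workers $(\underline{l}, \bar{l}]$. 12 What is the optimal wage
and employment policy in such an interval? The union’s maximization
problem in this interval is now

$$\max_{w(l),\, \theta^*} \mathcal{W} = \int_{l=\underline{l}}^{\bar{l}} \alpha\{l,\; [w(l) - \bar{w}]\, M(\theta^*)\}\, dl \tag{14}$$

subject to

$$\int_{l=\underline{l}}^{\bar{l}} w(l)\, dl = \theta^* \int_{l=\underline{l}}^{\bar{l}} f(l)\, dl, \tag{15}$$

$$\int_{s=\underline{l}}^{l} w(s)\, ds \ge \theta^* \int_{s=\underline{l}}^{l} f(s)\, ds, \quad \forall\, l \in (\underline{l}, \bar{l}], \tag{16}$$

and

$$w(l) \ge \bar{w}, \quad \forall\, l \in (\underline{l}, \bar{l}]. \tag{17}$$

Condition (15) is an isoperimetric constraint giving the (common)
critical state, $\theta^*$, and hence the common employment probability,
$M(\theta^*)$, of all the workers in the interval, as stated in proposition 1.
It guarantees that the firm is indifferent between $\underline{l}$ and $\bar{l}$ in state
$\theta^*$. Condition (16) incorporates the fact that,

12 The analysis of “corner” B-intervals is identical except for a binding inequality
constraint on $\theta^*$, and it is omitted to save space.
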
if $(\underline{l}, \bar{l}]$ is truly a single bundling interval, the firm must (at least
weakly) prefer its endpoint to any interior point. It is easily shown to
be nonbinding in the optimal policy if seniority assignments, $l(z)$, have
been optimally chosen. 13 Individual rationality is incorporated in (17).

Thus (14)-(17) can be solved by maximizing

$$\mathcal{W} = \int_{l=\underline{l}}^{\bar{l}} \Big( \alpha\{l,\; [w(l) - \bar{w}]\, M(\theta^*)\} + \Lambda[\theta^* f(l) - w(l)] + \phi(l)[w(l) - \bar{w}] \Big)\, dl. \tag{18}$$

First-order conditions are

$$\alpha_2\{l,\; [w(l) - \bar{w}]\, M(\theta^*)\}\, M(\theta^*) - \Lambda + \phi(l) = 0, \quad \forall\, l \in (\underline{l}, \bar{l}], \tag{19}$$

$$-\int_{l=\underline{l}}^{\bar{l}} \alpha_2\{\cdot\}\,[w(l) - \bar{w}]\, m(\theta^*)\, dl + \Lambda \int_{l=\underline{l}}^{\bar{l}} f(l)\, dl = 0, \tag{20}$$

$$\phi(l)\,[w(l) - \bar{w}] = 0, \quad \forall\, l \in (\underline{l}, \bar{l}]. \tag{21}$$

Solving for $\alpha_2\{\cdot\}$ and substituting (19) and (21) into (20) yields

$$M(\theta^*) = m(\theta^*)\, \frac{\int_{l=\underline{l}}^{\bar{l}} [w(l) - \bar{w}]\, dl}{\int_{l=\underline{l}}^{\bar{l}} f(l)\, dl}, \tag{22}$$

which is the same condition as (11), except that it applies to the entire
group of workers in an interval.

Thus employment policy in a B-interval is determined purely by
the efficiency condition (22), which is the condition for maximization
of union rents subject to the constraint that all workers in the interval
$(\underline{l}, \bar{l}]$ be assigned the same employment probability, or the same $\theta^*$.
This efficiency condition uniquely determines the wage bill $\int w(l)\,dl$
that can be charged for all workers in $(\underline{l}, \bar{l}]$. Wage policy on the
interval, given this total wage bill, the individual rationality constraint
(17), and constraint (16), is then dictated purely by distributional
considerations.
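
The separation between the efficiency condition and the division of the wage bill can be sketched numerically. The example below is a discretized illustration under assumptions not made in the paper: the bundle contains a finite list of workers, the state is uniform with upper bound `hi` (so the bundle's critical state is `(hi + n * w_bar / sum(f)) / 2`), and all names and numbers are hypothetical. It computes the common critical state and the total wage bill, then checks which splits of that bill satisfy the individual rationality constraint and the "no stopping inside the bundle" constraint.

```python
def bundle_policy(f_values, w_bar, hi):
    """Common critical state and total wage bill for a bundle (uniform theta, interior case)."""
    total_f = sum(f_values)
    n = len(f_values)
    theta_star = 0.5 * (hi + w_bar * n / total_f)
    return theta_star, theta_star * total_f


def is_valid_split(wages, f_values, w_bar, theta_star, tol=1e-9):
    """Check a proposed division of the wage bill among bundled workers.

    Requires every wage to be at least w_bar, every partial sum of wages to be
    at least theta_star times the matching partial sum of marginal products,
    and the totals to match exactly.
    """
    if any(w < w_bar - tol for w in wages):
        return False
    cum_w = cum_f = 0.0
    for w, f in zip(wages, f_values):
        cum_w += w
        cum_f += f
        if cum_w < theta_star * cum_f - tol:
            return False
    return abs(cum_w - theta_star * cum_f) <= tol


F_VALUES = [1.0, 0.8]              # assumed marginal products, senior worker first
theta_star, bill = bundle_policy(F_VALUES, w_bar=1.0, hi=3.0)

for split in ([2.5, 1.2], [2.2, 1.5], [2.0, 1.7]):
    print(split, is_valid_split(split, F_VALUES, 1.0, theta_star))
```

The first two splits are feasible and leave the employment probability unchanged, so the choice between them is purely distributional. The third split fails because the firm would prefer to stop after the senior worker in the critical state, which means the two workers would no longer form a single bundle.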


C. Optimal Overall Wage Policy 

In this subsection I present several main propositions concerning the 
union’s optimal overall policy. It is helpful to begin with a definition. 

13 Suppose, with $l(z)$ as well as wage policy chosen optimally, that there was a
B-interval in which, for some $l$, (16) was binding. Now consider reassigning seniority
indexes only among workers in this interval but keeping each individual’s wage the
same. This leaves (22) unchanged, affects nothing outside the interval, and can always
be done in such a way that the unconstrained solution to (19)-(21) is feasible and
therefore constitutes a superior policy.
Definition 1. Wage Trends over B-Intervals.—Consider a B-interval
$(\underline{l}, \bar{l}]$. Wages will be said to increase on average with seniority over this
interval if and only if $w(\underline{l}) > \tilde{w} > \lim_{l \to \bar{l}^+} w(l)$, where $\tilde{w}$ is the average
wage of all workers in $(\underline{l}, \bar{l}]$, that is, $\tilde{w} = [1/(\bar{l} - \underline{l})] \int_{l=\underline{l}}^{\bar{l}} w(l)\,dl$.

Proposition 2. Optimal Wage Policy.—Wages will increase
monotonically with seniority in U-intervals and will increase “on average”
with seniority over B-intervals if the following condition holds:

$$\theta m^2 + Mm + \theta M m' > 0. \tag{23}$$

Proof. In a U-interval, it is easy to show that

$$\frac{dw}{df} = \frac{\theta m^2 + Mm + \theta M m'}{2m^2 + Mm'}, \tag{24}$$

where the denominator is positive by the second-order condition,
(13). In a B-interval, rewrite (22) as

$$M\!\left[\frac{\tilde{w}}{\tilde{f}}\right] = (\tilde{w} - \bar{w})\, m\!\left[\frac{\tilde{w}}{\tilde{f}}\right] \frac{1}{\tilde{f}}, \tag{25}$$

with $\tilde{f}$ defined analogously to $\tilde{w}$. Now since $f(\underline{l}) > \tilde{f} > \lim_{l \to \bar{l}^+} f(l)$, then
$w(\underline{l}) > \tilde{w} > \lim_{l \to \bar{l}^+} w(l)$, provided (23) holds globally. Q.E.D.

Is the condition for wages to rise with seniority, (23), likely to be
satisfied? A sufficient condition for (23) to hold is $m^2 + Mm' \ge 0$. This
can be expressed as $(d/d\theta)[m(\theta)/M(\theta)] \ge 0$; that is, the hazard function
is nondecreasing. This condition is satisfied globally by a large number
of distributions, including the normal, exponential, and uniform.
It is of course also satisfied locally for any distribution whenever $m' \ge 0$,
as well as at $\bar{\theta}$, where $M = 0$. Thus it is likely to be satisfied at least
for some values of $\theta$, in most realistic cases.
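
As a rough check of these claims, the following Python sketch evaluates the hazard term `m**2 + M * m'` and the wage-slope condition `theta * m**2 + M * m + theta * M * m'` on a grid for two of the distributions named above. The grids, parameter values, and function names are assumptions made for the illustration; the exponential case ignores the bounded support used in the model and is included only because its hazard rate is constant.

```python
import math


def uniform(lo, hi):
    """Density m, its derivative dm, and right-hand cumulative M for uniform theta."""
    width = hi - lo
    return (lambda t: 1.0 / width,
            lambda t: 0.0,
            lambda t: (hi - t) / width)


def exponential(rate):
    """Density m, its derivative dm, and right-hand cumulative M for exponential theta."""
    return (lambda t: rate * math.exp(-rate * t),
            lambda t: -rate * rate * math.exp(-rate * t),
            lambda t: math.exp(-rate * t))


def hazard_term(dist, t):
    """m^2 + M m': nonnegative exactly when the hazard rate m/M is nondecreasing."""
    m, dm, M = dist
    return m(t) ** 2 + M(t) * dm(t)


def wage_slope_term(dist, t):
    """theta m^2 + M m + theta M m': the expression whose sign governs the wage profile."""
    m, dm, M = dist
    return t * m(t) ** 2 + M(t) * m(t) + t * M(t) * dm(t)


def check(dist, grid, tol=1e-12):
    """True if both conditions hold at every grid point (tol absorbs rounding error)."""
    return (all(hazard_term(dist, t) >= -tol for t in grid)
            and all(wage_slope_term(dist, t) > 0 for t in grid))


uniform_grid = [1.0 + 0.02 * k for k in range(101)]   # covers [1, 3]
exponential_grid = [0.1 * k for k in range(1, 51)]    # covers (0, 5]

print(check(uniform(1.0, 3.0), uniform_grid))
print(check(exponential(1.0), exponential_grid))
```

For the exponential distribution the hazard term is identically zero, so the sufficient condition holds with equality, and the wage-slope term reduces to `M * m`, which is strictly positive.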

The intuition behind the role of the hazard rate in determining the
slope of the wage profile is apparent when (11) is rewritten as

$$f(l) = \frac{m(\theta^*(l))}{M(\theta^*(l))}\, [w(l) - \bar{w}]. \tag{26}$$

The left-hand side of this expression gives the marginal effect of a
higher “critical” state, $\theta^*$, on the wage worker $l$ receives when
employed (from $\theta^* f(l) = w(l)$). The right-hand side gives the marginal
reduction in rents induced by a higher $\theta^*$, conditional on being
employed initially. When $l$ rises in (26), the marginal direct gain to a
higher $\theta^*$ falls ($f(l)$ diminishes). If the hazard rate, $m/M$, is
nondecreasing, then because $\theta^*(l)$ is monotonically increasing, the optimal
wage, $w(l)$, must fall with $l$.

One important implication of proposition 2 concerns the value of 
the seniority rule to the union. Since a seniority rule is always needed 
to implement B-intervals and since it has positive value in U-intervals
when (23) holds and zero value otherwise, it is clear that a
comprehensive seniority rule (covering all workers in the union) has value
to the union under fairly general conditions.

Another implication of proposition 2 is that the average union-
nonunion wage differential should move countercyclically, that is, fall
with $\theta$, regardless of whether B-intervals occur or not. To see this,
note that the nonunion average wage is constant over the cycle at $\bar{\omega}$,
that the unionized firm’s total wage bill $W(L) = \int_{l=0}^{L} w(l)\,dl$ increases
at a decreasing rate with employment in U-intervals and on average
over B-intervals, that employment is always procyclical (from lemma
A3 in the Appendix), and that only the endpoints of B-intervals are ever
realized as optimal employment levels. Thus the quantity discounts
implemented by the union will manifest themselves in higher average
prices paid by the union firm for labor in bad states of nature.
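
The countercyclical differential can be seen in a small discrete example. The sketch below is illustrative only and rests on assumptions outside the paper: three workers instead of a continuum, a uniform state with upper bound `HI`, wages set by the uniform-case U-interval formula `w = (hi * f + w_bar) / 2`, and hypothetical function names. The firm hires in seniority order and picks the employment level that maximizes its ex post profit in each state.

```python
def u_wage(f, w_bar, hi):
    """U-interval wage for uniform theta with upper bound hi (interior case assumed)."""
    return 0.5 * (hi * f + w_bar)


def employment(theta, f_values, wages):
    """Number of workers hired in state theta when hiring must follow seniority order."""
    best_n, best_profit, profit = 0, 0.0, 0.0
    for n, (f, w) in enumerate(zip(f_values, wages), start=1):
        profit += theta * f - w          # profit from adding the n-th most senior worker
        if profit > best_profit:
            best_n, best_profit = n, profit
    return best_n


def average_wage(theta, f_values, wages):
    """Average wage paid per employed worker in state theta (None if nobody is hired)."""
    n = employment(theta, f_values, wages)
    return sum(wages[:n]) / n if n else None


HI, W_BAR = 3.0, 1.0                     # assumed parameter values
F_VALUES = [1.0, 0.8, 0.5]               # marginal products, most senior first
WAGES = [u_wage(f, W_BAR, HI) for f in F_VALUES]

for theta in (2.05, 2.2, 2.6):           # bad, middling, and good states
    gap = average_wage(theta, F_VALUES, WAGES) - W_BAR
    print(theta, employment(theta, F_VALUES, WAGES), round(gap, 3))
```

With these numbers employment rises from one to two to three workers as the state improves, while the gap between the average union wage and the outside wage shrinks from 1.0 to 0.85 to 0.65.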

Proposition 3. Underemployment.—Regardless of whether or not
bundling intervals occur, the union’s optimal employment policy has
the following properties: Relative to a competitive firm, a union lowers
the equilibrium employment probabilities of all workers with comparable
seniority except (i) the least senior worker ever hired in the
union firm, whose employment probability is unchanged; (ii) (perhaps)
the least senior worker in any bundling interval, whose employment
probability may be unchanged by the union; and (iii) (possibly)
some subset of the set of workers who are employed with certainty in
the nonunion firm, if this set is nonempty. Some of these latter workers
may remain employed with certainty under the union, but if such
a group exists, it is smaller in the union.

Proof. See the Appendix. Q.E.D. 

An interesting implication of proposition 3 is that, regardless of
how much a union “cares” about its senior members, it is never optimal
to assign them higher employment probabilities than they
would have, with identical seniority ranks, in a nonunion setting. This
occurs because higher wages per worker (which result in lower employment
probabilities) are the only way the union can extract surplus
from the firm in this model and because there is always a positive
marginal gain from extracting the first unit of surplus earned on each
worker. It is therefore not an artifact of the risk-neutrality assumption
for workers, as is shown in Section IVB below. 14

14 This seemingly counterintuitive result needs to be interpreted with some care. In
particular, even assuming for the moment that seniority ranks are assigned according 
to years of service (which has not yet been demonstrated), it does not necessarily imply 
that a nonunion worker with 20 years’ service has more job security than a union 
worker with the same level of service, for two reasons. First, a seniority rule may not be 
used in nonunion firms, leaving all workers at equal risk of a layoff. This average layoff 
A second implication of proposition 3 is that the level of employment
in union firms is likely to be more sensitive to the “business
cycle,” $\theta$, than in nonunion firms, in the sense that union employment
equals nonunion employment in the best state but that the union
extends the range of equilibrium employment levels downward and
lowers every worker’s employment probability.

Proposition 4. Efficiency of the Optimal U-Policy.—If the second-
order condition (13) holds globally, a rent-maximizing union (i.e.,
with $A_{12} = A_{22} = 0$) will not implement any interior B-intervals.

Proof. See the Appendix. Q.E.D.

Proposition 4 has two main implications. The first is that, unless a
union strongly favors certain members relative to others, its distributional
preferences (among all workers at some risk of a layoff) will not
affect its optimal wage profile. To see this, recall that the optimal
U-policy is independent of the union’s distributional preferences. Thus
changes in union preferences away from rent maximization will not
affect the union’s optimal policy unless they are large enough to induce
a change of regime and cause bundling of workers. Although
the “size” of such changes in preferences is hard to quantify, it is clear,
at least in the two-worker case examined by Kuhn and Robert (1986),
that they must typically be nonmarginal in nature to induce bundling.
It thus seems likely that unions without access to internal transfers will
impose the same wage profiles on firms as unions with unlimited
access to transfers ($A_{12} = A_{22} = 0$).

Second, the union’s optimal wage schedule is likely to have the
following property: Suppose that each worker $z$ had the right to
unilaterally choose the union’s wage profile, $w(l)$, subject only to the
restriction that the profile chosen does not “bundle” him with other
workers. Then, because of the property noted in (12), each worker 
will choose the same wage profile, namely, the optimal U-policy. In 
this sense, the union’s policy will be unanimously supported by its 
members. 15 

Proposition 5. Assignment of Seniority Indexes.—If there are no bundling
intervals in the union’s optimal policy, then in the basic model
the optimal seniority-wage profile, w(l), is independent of the senior- 


rate in the nonunion sector could easily be above that of senior, union workers. Second, 
it is important to note that the layoffs modeled here are only temporary in nature: all 
union members are permanently attached to the firm by assumption here, giving them 
considerable job security in a different sense of the term. 

15 The unanimity property does not imply that the optimal U-policy is a Nash
equilibrium in wages. This is because, in the continuum-of-workers case, raising (say) worker
$l$’s wage above the optimal U-profile, holding all other wages constant, automatically
bundles him with some group of workers with less seniority. This will typically be
desirable for worker $l$.
ity assignment rule, $l(z)$. The optimal assignment of seniority ranks is
dictated purely by distributional considerations.

Proof. This follows trivially from the independence of the optimal
wage profile in a U-interval from the shape of $\alpha\{l, U(l)\}$ and from the
fact that all workers are equally productive. Q.E.D.

Some additional insights into the nature of the optimal assignment
of seniority ranks can be gained by considering the case in which
$A_{22} = \alpha_{22} = 0$. This means that union welfare is linear in individual
utility, and $A_2(z) = \alpha_2(l(z))$ can be interpreted as worker $z$’s welfare
weight. Since with the envelope theorem it is easy to show that $dU/df > 0$
(equivalently, $dU/dl < 0$, because $f$ is decreasing) in a U-interval,
clearly the union’s optimal policy here is to choose
$l(z)$ such that $\alpha_2(l)$ is monotonically decreasing in $l$. In other words, the
union simply gives the best jobs to those workers it “cares” about most,
with no implications for productive efficiency. The best jobs are
assigned to the workers who have been with the union the longest
(lower $z$’s) only to the extent that the union places a higher weight on
their utility than on other workers’ utilities.


IV. Extensions and Alternative Formulations 

In this section I consider extensions of the basic model in several
alternative directions in turn. For simplicity, I consider only the case
in which the union maximizes rents ($A_{12} = A_{22} = 0$) and there are no
interior bundling intervals in the optimal policy. This greatly simplifies
the analysis and, given proposition 4, seems a reasonable restriction.

A. Union-Firm Bargaining and Isoprofit Constraints 

In order to place the foregoing analysis into a context of union-firm
bargaining, it is useful to sketch briefly the nature of the utility-
possibility frontier between the firm and a rent-maximizing union
under the nonuniform pricing technology. The general shape of such
a frontier is shown in figure 2, which considers the case in which
$\lim_{l \to 0} \underline{\theta} f(l) > \bar{w}$; that is, it is jointly efficient to employ some group of
workers $l \in (0, l']$ in all states. Figure 1a indicates that, in this case, the
union can extract any amount of rents up to area C without
compromising efficiency. 16 Thus the utility-possibility frontier in figure 2

16 The result of Hall and Lilien (1979) discussed in n. 1 appears to be based on an
implicit assumption that the union can impose fixed charges greater than area C in
fig. 1a on the firm without affecting any of its employment decisions. This, of course,
makes the utility-possibility frontier in fig. 2 linear everywhere (not just between a and
b) and makes all possible divisions of rents between the firm and union consistent with
efficiency in production.
Fig. 2.—Utility-possibility frontier between the firm and union. (Figure not reproduced; the surviving axis label reads “accruing to union.”)


has a 45° section between points a and b. If the union extracts more
than C from the firm, the previous analysis shows that it will typically
both (a) set wages above $\bar{w}$ for workers with $l > l'$ and (b) extend the
range of realized employment levels downward (i.e., lower $l'$). This
compromises overall efficiency, giving the utility-possibility frontier a
steeper slope than 45° between points b and c. Maximum rents are
extracted from the firm at point c, and since of course one can always
design wage profiles that eliminate all employment, the frontier must
eventually return to the origin beyond c.

For simplicity, Sections II and III characterized the behavior of a
union that was able to achieve point c. This section considers what
happens when the firm has sufficient bargaining power during contract
negotiations to require the solution to be on bc, by imposing a
binding ex ante, isoprofit constraint in the optimization.

When expected profits are constrained to equal II and there are no 
interior B-intervals, the union’s rent-maximization problem can be 
written as 


max [ (/ - w)dl + ( (w - ui(27) 

I'M!) Ji~ 0 J/—/• V// 
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( Z f (6/ - w)m(d)d@dl - n, (28) 

J/-0 k-w/f 

where /' is the lowest employment level ever chosen by the firm and / 
is the highest. 

First-order conditions for this problem with respect to w(l ) are 

(1 - - (ill -w) Zjr, (29) 

where X, the multiplier associated with the isoprofit constraint (28), is 
easily shown to be between zero and one for all solutions on segment 
be. 

Differentiating (29) yields 

m _L <0 m 2 + Mm + 0Afwt'), (30) 

aj ma 

& “ S<rV M < < 31 > 

where $A > 0$ by the second-order condition for a maximum. Thus
from (30), wages will rise with seniority under exactly the same conditions
as before (see proposition 2). Also, whenever $\lambda$ (an index of the firm's
bargaining power) is positive, (31) indicates that the union's optimal
wage profile is below its level in the basic model. Thus the convenient
"unanimity" property no longer holds, and disagreement over the
wage profile will occur even when all bundling is ruled out. The
reason is that raising any worker's wage now imposes negative
externalities on other workers by "tightening" the isoprofit constraint they
face. Whether such externalities are important may well vary across
unions and might account for some observed differences in union
organizing costs.
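The effect of the firm's bargaining power on the wage profile can be illustrated numerically. The sketch below is not part of the original analysis. It assumes that the first-order condition has the form $(1 - \lambda)M(w/f) = (w - \bar{w})\,m(w/f)/f$, that $M$ is the probability that the state is at least $\theta$, that $\theta$ is uniform on [0, 1], and that $\lambda$ can be treated as a given number rather than solved for jointly with the constraint. The function name `union_wage` and all parameter values are hypothetical.

```python
from typing import Callable


def union_wage(f: float, w_bar: float, lam: float,
               M: Callable[[float], float], m: Callable[[float], float],
               tol: float = 1e-12) -> float:
    """Solve (1 - lam) * M(w/f) = (w - w_bar) * m(w/f) / f for w by bisection.

    Assumes w_bar < f and a single sign change of the gap on [w_bar, f].
    """
    def gap(w: float) -> float:
        theta = w / f
        return (1.0 - lam) * M(theta) - (w - w_bar) * m(theta) / f

    lo, hi = w_bar, f  # gap(lo) > 0 and gap(hi) < 0 in the uniform example
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative assumption: theta uniform on [0, 1].
def M_unif(theta: float) -> float:
    return 1.0 - theta


def m_unif(theta: float) -> float:
    return 1.0


basic = union_wage(2.0, 1.0, 0.0, M_unif, m_unif)        # lam = 0: basic model
constrained = union_wage(2.0, 1.0, 0.5, M_unif, m_unif)  # lam = 0.5: binding constraint
```

Under these assumptions the wage has the closed form $w = [(1 - \lambda)f + \bar{w}]/(2 - \lambda)$, so the constrained wage lies below the basic-model wage whenever $f > \bar{w}$, while the wage still rises with $f$.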


B. Risk Aversion 

In the absence of a binding isoprofit constraint, it is easy to see that 
adding firm risk aversion to the basic model has no effect on the 
results. This is because, in the absence of a binding isoprofit
constraint, the only aspect of the firm's behavior that enters into the
problem is its ex post profit-maximizing decision, which is unchanged 
by risk aversion. Firm risk aversion in the presence of an isoprofit 
constraint makes the model more similar to implicit contracts models 
and is not considered here. 
Worker risk aversion can be analyzed by replacing workers' utility
functions in (1) by

$$V(w, e) = V[we + \bar{w}(1 - e)], \tag{32}$$

where $V' > 0$, $V'' < 0$, and $e \in \{0, 1\}$. When the "property rights"
assumption is maintained, worker $l$ thus receives utility $V(w(l))$ when
employed and $V(\bar{w})$ when on layoff. The implications of this
reformulation are given by the following proposition.

Proposition 6. Worker Risk Aversion. The union's optimal wage
profile in the presence of worker risk aversion is, for all workers
except the most junior and those employed with certainty, (i) strictly
above the nonunion wage $\bar{w}$, (ii) strictly below the union wage in the
absence of risk aversion, and (iii) increasing with seniority.

Proof. See the Appendix. Q.E.D. 

Part i of proposition 6 implies, as claimed earlier, that even in the 
presence of worker risk aversion, employment probabilities are never 
greater for a union worker than a nonunion worker of comparable 
seniority if nonunion firms were to use a seniority rule. Part ii reflects 
the fact that the net marginal gains to raising wages are lower when 
workers are risk averse. Part iii simply indicates that one of the basic 
model’s main conclusions—quantity discounts and the need for a 
seniority rule—is unchanged by the introduction of risk aversion. 


C. Firm-specific Skills 

In this subsection I relax the assumption that all workers are equally
able: $h(1, z) = 1$, $\forall z$. Since hours per worker are still assumed fixed,
the union's labor supply technology can now be summarized by the
function $h(z)$: $(0, Z] \to [0, \bar{h}]$, giving the total labor input supplied by
worker $z$ if employed. I focus on firm-specific skills by continuing to
assume that all workers have the same opportunity incomes, $\bar{w}$. 17 For
simplicity I continue to use a one-period model. Thus the union is
viewed as having an inherited stock of workers with varying amounts
of skills and as deciding how to price these workers for a single
period. The dynamics of progression through seniority ranks are
considered in Section IVD below.


17 With general human capital (which raises a worker's opportunity wage
proportionally to the amount of labor he can supply to the current firm), the union's optimal
price per unit of labor sold to the firm is the same as in the basic model. The only
difference is that some workers have more units of labor to sell to the firm than others.
1. Equivalence to the Basic Model

Given a seniority assignment rule $n(z)$, the total amount of human
capital available when $N$ workers are employed is now

$$L(N) = \int_{n=0}^{N} h(n)\,dn. \tag{33}$$

Equation (33) implies the following relationship between the indexes $l$
and $n$:

$$\frac{dn}{dl} = \frac{1}{h}, \tag{34}$$

which now replaces the property $n = l$ used in Sections II and III.
When $h_{12} = h_{22} = 0$, the union maximizes

$$W = \int_{n=0}^{Z} \{\omega(n)p(n) + \bar{w}[1 - p(n)]\}\,dn, \tag{35}$$

where $\omega(n)$ is the total compensation of worker $n$ when employed.
Using (34) and letting $w(l) = \omega(l)/h(l)$ and $\bar{w}(l) = \bar{w}/h(l)$, we get

$$W = \int_{l=0}^{\bar{L}} \{w(l)p(l) + \bar{w}(l)[1 - p(l)]\}\,dl, \tag{36}$$

where $\bar{L} = L(Z)$ is the total amount of human capital at the union's
disposal. Equation (36) is equivalent to the union maximand in the
basic model, except that $\bar{w}$ now varies with $l$. The union's optimal
wage and seniority assignment policies in the presence of heterogeneous
firm-specific skills can be found by maximizing (36) with respect
to both $w(l)$ and $\bar{w}(l)$, subject to the constraint that the schedule $\bar{w}(l)$ be
attainable via some one-to-one seniority assignment rule, $n(z)$.


2. Optimal Seniority Assignment 

The solution to the optimization problem described above exhibits 
the following property. 

Proposition 7. Firm-specific Skills and the Seniority Assignment Rule.
When workers differ in their endowments of firm-specific skills, the
union's optimal seniority assignment rule, $n(z)$, gives (monotonically)
more seniority (lower $n$) to workers with more firm-specific skills, $h(z)$.

Proof. See the Appendix. Q.E.D.

The intuition behind proposition 7 is based on cost minimization:
regardless of the pricing schedule the union imposes on the firm, it is
always in the rent-maximizing union's interest to minimize the total
opportunity cost of the amount of labor input, $L$, supplied to the firm.
This always involves assigning the highest employment probabilities
to the workers with the most firm-specific skills. To the extent that
firm-specific skills are determined by years of service in the firm, this
may help explain why seniority indexes are assigned according to
years of service in most firms.
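The cost-minimization logic can be checked on a small discrete example. The following sketch is illustrative only: it assumes a finite set of workers who are hired whole and in seniority order, each with the same opportunity income `w_bar`, and the skill values are hypothetical.

```python
def workers_needed(skills_in_seniority_order, target_labor):
    """Count workers hired, most senior first, until target_labor units are supplied."""
    supplied = 0.0
    for count, h in enumerate(skills_in_seniority_order, start=1):
        supplied += h
        if supplied >= target_labor:
            return count
    raise ValueError("target exceeds the union's total human capital")


def opportunity_cost(skills_in_seniority_order, target_labor, w_bar):
    """Total opportunity income forgone by the workers who must be employed."""
    return w_bar * workers_needed(skills_in_seniority_order, target_labor)


skills = [1.0, 3.0, 2.0]                     # hypothetical firm-specific skills h(z)
most_skilled_first = sorted(skills, reverse=True)
least_skilled_first = sorted(skills)

cost_skill_rule = opportunity_cost(most_skilled_first, 3.0, w_bar=1.0)
cost_reverse_rule = opportunity_cost(least_skilled_first, 3.0, w_bar=1.0)
```

Supplying three units of labor takes one worker when the most skilled worker is most senior, and two workers under the reverse ranking, so the forgone opportunity income is lower under the skill-based rule.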


3. Optimal Wage and Marginal Cost of Labor 
Profiles 

In summarizing the implications of the model with heterogeneous 
firm-specific skills for wage profiles, it is important now to distinguish 
at least three separate concepts, which were all essentially equivalent 
in the basic model. The first of these is the marginal cost of labor to
the firm, given by $w(l)$. The slope of the marginal cost of labor
schedule, with specific skill heterogeneity, is given by

$$\frac{dw}{df} = \frac{\partial w}{\partial f} + \frac{\partial w}{\partial \bar{w}}\frac{d\bar{w}}{df} = B + \frac{m^2}{2m^2 + Mm'}\frac{d\bar{w}}{df}, \tag{37}$$

where $B = (\theta m^2 + Mm + \theta Mm')/(2m^2 + Mm') > 0$, which is the slope
of the marginal cost of labor schedule ($dw/df$) in the absence of any
specific skill heterogeneity, and $d\bar{w}/df < 0$ under the optimal seniority
assignment rule. Thus, for given $\bar{w}$ and $f$, the marginal cost of labor to
the firm rises less rapidly with seniority than in the absence of
firm-specific skill differences. Indeed, it appears possible that, in firms with
considerable heterogeneity in worker skills ($d\bar{w}/df$ large in absolute
value), the optimal marginal cost of labor to the firm can fall with the
level of employment, even when (23) holds. This would render a
seniority rule redundant and suggests that the value of a seniority
rule to the union (unlike the firm) may be negatively related to the rate
of specific skill acquisition.
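A small numerical check may help fix the magnitudes involved. The sketch below is not from the original text. It evaluates $B = (\theta m^2 + Mm + \theta Mm')/(2m^2 + Mm')$ and a pass-through term $m^2/(2m^2 + Mm')$ for two illustrative distributions of $\theta$. The pass-through expression is an assumption here, obtained by differentiating a first-order condition of the form $M(w/f)\,f = (w - \bar{w})\,m(w/f)$ with respect to $\bar{w}$, with $M$ taken to be the probability that the state is at least $\theta$. The function names and the value of $d\bar{w}/df$ are hypothetical.

```python
import math


def slope_B(theta, M, m, m_prime):
    """Slope of the marginal cost schedule, dw/df, without skill heterogeneity."""
    return (theta * m**2 + M * m + theta * M * m_prime) / (2 * m**2 + M * m_prime)


def pass_through(M, m, m_prime):
    """Assumed response of w to the per-unit opportunity cost, holding f fixed."""
    return m**2 / (2 * m**2 + M * m_prime)


# Uniform on [0, 1], evaluated at theta = 0.75: M = 1 - theta, m = 1, m' = 0.
B_unif = slope_B(0.75, 0.25, 1.0, 0.0)
pt_unif = pass_through(0.25, 1.0, 0.0)

# Unit exponential, evaluated at theta = 1: M = m = exp(-1), m' = -exp(-1).
e1 = math.exp(-1.0)
B_exp = slope_B(1.0, e1, e1, -e1)
pt_exp = pass_through(e1, e1, -e1)

# Hypothetical skill gradient: per-unit opportunity cost falls with f.
d_wbar_df = -0.2
marginal_cost_slope = B_unif + pt_unif * d_wbar_df      # flatter than B
markup_slope = B_unif + (pt_unif - 1.0) * d_wbar_df     # steeper than B
```

In the uniform case $B$ and the pass-through both equal one-half, so a negative $d\bar{w}/df$ flattens the marginal cost schedule and steepens the markup profile. In the exponential case, where the hazard rate is constant, the pass-through equals one.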

A second concept is the "markup" charged by the union on the $l$th
unit of labor sold to the firm, $w(l) - \bar{w}(l)$. The slope of the union
markup profile is given by

$$\frac{d}{df}(w - \bar{w}) = B + \left(\frac{m^2}{2m^2 + Mm'} - 1\right)\frac{d\bar{w}}{df} \ge B. \tag{38}$$

Thus the union charges higher absolute markups for more senior
workers, and for given $\bar{w}$ and $f$ the size of the markup increases more
rapidly with seniority than in the absence of firm-specific skill
heterogeneity.

The last concept I consider here, and the one most easily measured
empirically, is the total (say hourly) wage of the worker who supplies
the $l$th unit of labor to the firm, given by $\omega(l) = w(l)h(l)$. This differs
from the marginal cost profile $w(l)$ because workers with more skills
supply more units of labor to the firm per hour than others. The
slope of this schedule (for $h = 1$) is given by

+ (S9) 

Thus, under an optimal seniority assignment rule, total hourly wages
will always increase with seniority as long as $m(\theta)$ satisfies the
nondecreasing hazard condition. The rate of increase of wages with
seniority is greater than in the absence of training, for given $\bar{w}$ and $f$.

D. Variable Hours per Worker 

In this subsection I consider a situation in which the labor supply
technology $h(e(z), z)$ is fully general; in other words, hours of work $e(z)$
can vary continuously for each worker, and workers may differ in
firm-specific skills by having $h_2 \neq 0$ and $h_{12} \neq 0$. To render the
hours/bodies issue nontrivial, I suppose, because of fixed "setup" costs per
worker, that average $h$ per worker, $h(e, z)/e$, has a unique maximum at
some $e > 0$ for each $z$. To focus on essentials, the analysis is
conditional on a given assignment of seniority rights $n(z)$.

1. Equivalence to the Basic Model 

Suppose that, in addition to the seniority rule for workers, the union
had access to two policy instruments. These are, first, a wage schedule
$w(n, e(n))$, $\forall n \in (0, Z]$, specifying the total amount the firm must pay
worker $n$ if it wishes to employ him for $e(n)$ hours and, second, an
"hours rule" $e(n, L)$, $\forall n \in (0, Z]$, $\forall L \in (0, \bar{L}]$, giving each worker's
total hours when $L$ units of labor are employed. Then we can prove
the following proposition.

Proposition 8. Equivalence of the Variable-Hours and Basic Model. If
a rent-maximizing union prices labor by the hour and can choose an
"hours rule," its maximand can be written as (36), which is equivalent
to the basic model except that the marginal opportunity cost of labor,
$\bar{w}$, varies with $l$ and is endogenous.

Proof. See the Appendix. Q.E.D. 

Thus the analysis of hours of work in union firms is greatly
simplified if unions use hours rules. Do such rules exist in actual
unions? To see that they do, note that a number of common union policies
such as minimum weekly hours per employed worker, rules governing
the allocation of overtime, and requirements that layoffs must
commence when average weekly hours per employed worker fall below
35 (see, e.g., Slichter, Healy, and Livernash 1960, pp. 150-54) are
all examples of hours rules. Although it is unlikely that such crude
rules are in general sufficient to achieve the union's most desired
hours/bodies mix, they may be close enough approximations to the
efficient rule to make the present analysis relevant. Indeed, simple
hours rules such as these are sometimes predicted by the theory (see
below).

2. Optimal Hours Rules 

Not surprisingly, it is easy to show that the rent-maximizing union's
optimal hours rule, like its optimal seniority assignment, is the one
that minimizes the total opportunity cost of providing $L$ units of human
capital (given in [A22] in the Appendix) for every $L$, for the
same reasons as before. The proof is parallel to the proof of
proposition 6.

What will the union's optimal hours rule look like? To see this,
consider first the solution to the union's cost minimization problem
(minimize [A22] subject to [A21]) in the case of identical workers ($h_2 \equiv 0$).
It is then easy to show that optimal hours per employed worker are
constant, 18 and thus the union's optimal hours rule is simply to
prohibit hours reductions and require the firm to make all adjustments to
labor input on the extensive margin. Such a rule is necessary to
ensure efficiency because, in its absence, the firm will attempt to increase
the hours of the cheaper, junior workers at the expense of the seniors,
even though all workers are equally productive. When workers are
not identical, more complex hours rules than those above are
required, in which, if specific skills increase with seniority, hours of
employed workers are likely to be procyclical. Thus one would expect
unions to try to impose different kinds of hours rules in firms in
which considerable training occurs than elsewhere. Also, if hours
variation was small compared with employment variation in most
unions, this would indicate that relatively little heterogeneity in
firm-specific skills existed in unionized firms.

Finally, it is interesting to note that the basic model’s unanimity 
result continues to hold in the variable-hours model in the following 
sense. Given an hours rule, each worker knows exactly at what level of 


18 First-order conditions for a minimum of (A22) subject to (A21) can be written as


1 

ML) 


hMn), n) 


N ) 


«N) 


V n € (0, AT]. 


The marginal $h$ of each worker hired at any given $L$ equals the average $h$ of the last
worker hired at that $L$. Since this is true for worker $N$ as well, his effort level must be the
one that maximizes his average $h$, i.e., $h(e)/e$. If all workers are identical, this is
independent of the identity of the last worker hired, and therefore hours per worker, $e(n)$, are
independent of the amount of human capital supplied, $L$.
$L$ (total labor input) he will be asked to offer each hour of work he
"owns" to the firm. The wage he would like to charge the firm for
each hour will be the same as the wage the union actually sets if
bundling is ruled out. Thus the hours rule can be thought of as a kind
of seniority rule for individuals' hours of work.

E. Many Periods, Overlapping Generations 

This subsection considers the consequences of extending the basic 
model into a multiperiod, overlapping-generations framework. It 
makes two main points. The first is that, when there is no on-the-job 
training, the union’s multiperiod optimization problem can always be 
decomposed into a series of independent static problems, each of 
them identical to the basic model. To see this, it is sufficient to note 
that here, the best a union can do to maximize the present value of 
rents extracted is simply to extract the most rents possible in each 
period. This property also applies if specific skills accumulate at an
exogenous rate with time spent in the union. Then, each period, the
distribution of skill in an overlapping-generations union with a constant
inflow and outflow of members is exogenously determined, and
the union's dynamic problem consists of a series of static problems
with heterogeneous firm-specific skills, of the type analyzed in Section
IVC above. Thus, in certain circumstances, the basic model is easily
generalized to a multiperiod context. However, difficulties of the sort
analyzed in Carmichael (1983a) will arise if skills accumulate with time
spent employed. In that case the stock of skills at the union's disposal at
any time depends on past union policy as well as the history of shocks,
$\theta$. This considerably complicates the analysis and is beyond the scope
of the current paper. One would expect, however, that such
considerations would increase the union's (as well as the nonunion firm's)
incentives to ensure that junior members are employed, relative to
what would be optimal in a single-period framework.

The other main point of this subsection is that, when time is
explicitly introduced into the model, the problem of inducing some union
members to accept the undesirable seniority ranks (those with
high values of $l$) required by an efficient nonuniform
pricing policy can be considerably mitigated relative to a one-period
world. The way to do so is, once again, to assign the most desirable
seniority ranks at any time to the workers with the most years of
service at that point.

To see this, imagine an overlapping-generations union with a constant
membership of $Z$, in which each member lives for a length of
time $T$. There is no training, and technology is constant over time. Let
$z$ be an index that decreases monotonically with workers' accumulated
years of service at any $t$. This implies that, for an individual who joins
the union at date $t = 0$, $z(t) = Z - (Z/T)t$, where $Z/T$ is the rate at
which members "progress" through the age ranks. When discounting
is ignored, 19 this individual's remaining expected present value of
utility (PVU) from union employment at date $t'$ is given by

$$\mathrm{PVU}(t') = \int_{t=t'}^{T} U^*[l(z(t))]\,dt, \tag{40}$$

where $U^*(l)$, the maximized utility of a worker with seniority rank $l$,
and $z(t)$ are both known, decreasing functions (see the discussion of
proposition 5), and $l(z)$ is the seniority assignment rule.

To see why the assignment of less desirable seniority ranks to certain
union members might cause problems, recall that the union
under consideration is a cartel of workers that could be undermined
if enough of its members "defected" by, for example, undercutting
the wages received by workers with more desirable seniority ranks.
More precisely, suppose that, at any point in his life, a worker is able
to "defect" and impose some cost, y, on the union and that the union
immediately detects and expels all defectors. Then, clearly, the union,
in assigning seniority ranks via $l(z)$, faces a problem analogous to the
firm in Lazear (1979). 20 A natural solution to this problem involves
giving the better jobs to older workers in each period, that is, choosing
$l(z)$ to be monotonically increasing. Another way to see this is to
imagine that a nonuniform pricing union was introduced into the firm at
date zero, creating a set of "good" and "bad" jobs. If the good jobs
were immediately given to the young members, older members would
likely be tempted to undercut the union immediately. Such a union
would be unlikely to survive, while a union with the opposite
assignment rule could easily have no shirking problems at any time in its
history. 21
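The comparison between the two assignment rules can be made concrete with a numerical sketch. Nothing below comes from the original analysis beyond the form of the remaining-utility integral and $z(t) = Z - (Z/T)t$. The utility schedule $U^*(l) = 1 - l/Z$, the parameter values, and the function name `pvu` are hypothetical choices made only for illustration.

```python
def pvu(t_start, T, Z, rank_of, utility_of_rank, steps=10_000):
    """Midpoint-rule approximation to the integral of U*[l(z(t))] from t_start to T."""
    dt = (T - t_start) / steps
    total = 0.0
    for k in range(steps):
        t = t_start + (k + 0.5) * dt
        z = Z - (Z / T) * t          # service index of a member who joined at t = 0
        total += utility_of_rank(rank_of(z)) * dt
    return total


Z, T = 1.0, 40.0


def utility(l):
    # Hypothetical decreasing schedule: better (lower) ranks give higher utility.
    return 1.0 - l / Z


def service_rule(z):
    # l(z) increasing in z: members with the most service get the best ranks.
    return z


def reverse_rule(z):
    # Opposite assignment: the newest members get the best ranks.
    return Z - z


remaining_service = pvu(30.0, T, Z, service_rule, utility)
remaining_reverse = pvu(30.0, T, Z, reverse_rule, utility)
lifetime_service = pvu(0.0, T, Z, service_rule, utility)
lifetime_reverse = pvu(0.0, T, Z, reverse_rule, utility)
```

With these assumed functional forms the two rules give the same lifetime total at entry, but a member with 30 years of service has far more left to lose under the service-based rule, which is the sense in which that rule weakens the incentive to defect.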

F. Layoff Pay 

By assuming no transfers and a single period, the basic model ruled 
out all payments by the firm to unemployed workers. This subsection 
considers the consequences of allowing two types of such payments. 

19 Discounting, unless it is so severe as to render future income valueless, only
complicates the analysis without adding any new features.

20 Note that the parallel problem of shirking by the union does not arise here, as it
does for the firm in Lazear (1979). This is because, unlike the firm in the optimal
contracting problem, the union does not collect "bonds" from younger workers, which
it can then fail to pay back. It simply possesses a set of jobs of varying desirability that
must be allocated to workers. Thus, like the firm in Carmichael (1983b), the union has
no incentive to default.

21 It is worth noting that, unlike the argument based on firm-specific skills, this
argument justifies using years of service to allocate seniority ranks even when
productivity differentials indicate otherwise.
Consider first the case of purely "funded" layoff pay. Here, in
order to employ any worker $l$ in period i, the firm must both pay the
worker's wage and make an agreed-on contribution to an unemployment
benefits fund. Accumulated contributions in this fund are then
used by the union (or the firm) to make payments to unemployed
workers, in either the current or future periods. As long as the firm's
contributions to the fund are contractually fixed and are counted as
part of the marginal cost of employing worker $l$, such plans leave the
conclusions of the present model unaltered.

Now consider a simple form of purely unfunded layoff pay. Here,
in each period, instead of $w(l)$ the union confronts the firm with two
schedules, $s(l)$ and $t(l)$, giving worker $l$'s total compensation when
employed and unemployed, respectively. 22 From the basic model, we
know that the union's rent-maximizing policy here typically requires
$w(l) = s(l) - t(l)$ to be decreasing in $l$. This implies an interesting test
of the current model: If unfunded unemployment compensation
increases with seniority, wages when employed must increase at an even
more rapid rate.

Finally, it is worth noting that both types of layoff pay considered
above do not increase the union's ability to extract rents. This is
because the fundamental constraint limiting the union's ability to extract
rents is the requirement that $W(0) = 0$; this constraint is not
incompatible with layoff pay and is not relaxed in the analysis above.

G. Other Extensions 

This subsection suggests two additional extensions of the basic model 
that, while not undertaken formally here, seem promising areas for 
further research. They are (i) complementarity between workers and 
(ii) assigning wages to jobs, not workers. 

1. Complementarity 

The basic model and all extensions considered to this point assume 
that all workers hired by the firm are perfect substitutes. 

Complementarity between workers may be introduced into the
model, for example, by proposing that the firm hires two different
kinds of labor, $L_1$ and $L_2$, and has a production function $F(L_1, L_2)$,
$F_{12} > 0$. The two types of labor might be thought of as workers in

22 To satisfy the constraint $W(0) = 0$, I also assume that, whenever total employment,
$L$, falls below some minimum level, layoff pay, $t(l)$, of all workers goes to zero. While
such a minimum $L$ must always exist, it may be sufficiently low that it is never realized
as an optimal employment level.
different departments of the firm, whose outputs both contribute to
the final product. It seems likely that, depending on the nature of the
unobserved shocks affecting the firm, a union might prefer either to
impose separate seniority ladders for the two departments (which
allows the firm to control the relative size of the two departments) or,
alternatively, to use "plantwide" seniority, which could be used by the
union, with the right assignment of seniority indexes, to control
relative employment levels in the two departments. Since both kinds of
systems do exist, it may be interesting to derive conditions under
which one is preferred to the other.

2. Wages for Jobs 

In most union contracts, the wages the firm pays any given worker 
depend not only on his seniority but also on the job or “task” to which 
he is assigned. Can this feature of actual union practice be analyzed 
using the present model? To illustrate how this might be done,
consider a simple example.

Suppose that the firm's marginal product of labor schedule, $f(l)$,
actually consists of a series of tasks, or jobs, each of which requires one
worker. The function $f(l)$ orders jobs in terms of declining
productivity. In this interpretation of $f(l)$ the basic model corresponds to a
situation in which the union attaches wages to workers, but the firm,
once it has hired the desired number of workers, is free to assign
these workers to tasks. Naturally, it always fills the highest
productivity tasks first in every state.

In this world, can the union gain anything by attaching wages
directly to tasks? Interestingly, the answer may be yes if the
unconstrained solution to (11) ever exhibits $d\theta^*/dl < 0$, which
essentially entails assigning higher employment probabilities to
lower-productivity tasks. That is infeasible when the union assigns wages to
workers (instead, it leads to bundling) but is feasible when wages are
assigned directly to tasks. Thus a wages-for-jobs model of unionism
seems a promising topic for further study.


V. Summary 

The basic model in this paper showed that even if a union is
composed of identical, perfectly substitutable workers, is totally
indifferent to the distribution of rents among its members, and operates for
only one period, it will typically wish to implement a wage and
employment policy with several features of observed seniority systems.
More specifically, the union assigns each worker to a position in a
queue that determines the order in which workers are hired and laid
off. A worker's position in this queue is called his "seniority rank."
The union's optimal policy involves charging the firm higher wages
for workers with more desirable positions in the seniority queue, in
the sense that these positions correspond to higher priority in hiring.
As a result, union employment is typically below and more cyclically
variable than employment in a nonunion firm with the same
technology, and the union-nonunion average wage differential moves
countercyclically. All these properties are characteristics of an optimal
nonuniform pricing system analogous to that studied previously in
goods markets.

The basic model also yielded some interesting insights into union
politics. In particular, the union's optimal pricing policy has the
surprising property that, within limits and conditional on each worker's
assigned seniority rank, it will be unanimously approved by all union
members. In the basic model, conflicts of interest between union
members involve only the allocation of seniority ranks to workers;
indeed this allocation is determined purely by distributional (as
opposed to efficiency) considerations there.

Several extensions of the basic model were examined. One
important result of these exercises is that, when time and firm-specific skills
are added to the model, two efficiency-related reasons emerge for
assigning higher seniority to workers with more years of service.
These involve the efficient utilization of firm-specific skills and the
maintenance of incentives of members not to "defect" from the union
by undercutting each other's wages. Other extensions considered
binding isoprofit constraints (the main effect of this change is to
remove the unanimity result), worker risk aversion (this lowers the
optimal seniority-wage profile), variable hours per worker (this creates
the need for hours rules such as union restrictions on hours
reductions), and layoff pay (perhaps surprisingly, this does not make a
rent-maximizing union any better off under the assumptions of the
model). Certain extensions also suggested promising areas for further
research, namely, the dynamics of skill accumulation,
complementarity between workers, and wages for jobs.

Finally, some remarks about testing the nonuniform pricing model
of unions are in order. Since the ultimate usefulness of the present
model depends on the outcome of such tests, it is important that they
pay careful attention to two problems that are likely to arise. One of
these is the possibility of differences in the rate of firm-specific human
capital formation between union and nonunion firms. These should
be taken account of in drawing union-nonunion comparisons since,
for example, if skills accumulate more rapidly with tenure in
nonunion firms, observed union tenure-wage profiles need not be steeper
than nonunion profiles, as the basic model predicts.

The second complication that arises in the testing process concerns
the issue of endogenous union incidence. To the extent that firms
differ in their underlying technologies (i.e., the function $F(L)$ and the
distributions of $\theta$ with which they are endowed) and that unions fare
better in certain types of firms than others, the predictions derived in
this paper, which assume identical technologies in union and
nonunion firms, again are not directly amenable to testing on
cross-section data. Solving this problem requires modeling the process of
union formation and management opposition simultaneously with
each individual union's nonuniform pricing problem.


Appendix 

A. Proof of Proposition 1 

Proposition 1 summarizes the implications of the firm's profit-maximizing
employment decisions for the employment probability of a worker as a
function of his seniority rank, $p(l)$. The proof below applies to all wage profiles
$w(l)$ that are continuous from the left. It is constructive: lemma A1 establishes
the firm's optimal employment rule while the remaining lemmas draw out its
implications for $p(l)$. Proposition 1 consists of the last two lemmas, A5 and
A6.

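Definition A1. When faced by a given wage profile, $w(l)$, the firm's profit
over the interval $(\underline{l}, \bar{l}]$ in state $\theta$ can be written as

$$\pi(\underline{l}, \bar{l}, \theta) = \int_{l=\underline{l}}^{\bar{l}} [\theta f(l) - w(l)]\,dl. \tag{A1}$$

It is easy to see that $\pi(\underline{l}, \bar{l}, \theta)$ is strictly increasing in $\theta$.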

Lemma A1. The quantity $l^*$ is an optimal employment level for the firm in
the state $\theta$ if and only if

$$\pi(l, l^*, \theta) \ge 0, \quad \forall\, l < l^*, \tag{A2}$$

$$\pi(l^*, l, \theta) \le 0, \quad \forall\, l > l^*, \tag{A3}$$

$$\theta f(l^*) \ge w(l^*). \tag{A4}$$

Proof. Conditions (A2) and (A3) follow directly from the definition of
maximum profits, $\pi(0, l^*, \theta) \ge \pi(0, l, \theta)$, $\forall l$, by subtraction. Condition (A4)
follows from continuity of $f(l)$ and left-hand continuity of $w(l)$: If $\theta f(l^*) <
w(l^*)$, then profits can be increased by lowering $l^*$. Note, however, that
$\theta f(l^*) > w(l^*)$ does not necessarily imply the converse, as is apparent in figure
1<>, state $\theta^*$.

Assumption A1. Whenever the firm is indifferent between two or more
levels of $l$, it chooses the largest of those levels.

Definition A2. Let $l^*(\theta)$ be the function that maps each state $\theta$ into the
largest optimal level of $l$ for that state.

Lemma A2. A worker, $l$, is employed in state $\theta$ if and only if $l^*(\theta) \ge l$.

Proof. This follows directly from the definition of the seniority rule and
assumption A1.

Lemma A3. If a worker is employed in state $\theta$, he is also employed in all
states $\theta' > \theta$. In other words, $l^*(\theta)$ is nondecreasing.

Proof. Recall that $l^*(\theta)$ is optimal in state $\theta$, and consider another state $\theta' >
\theta$. Also consider an arbitrary worker $l \le l^*(\theta)$ who was employed in $\theta$ and the
profits earned on the interval $(l, l^*]$ in the two states.

In state $\theta$, because $l^*(\theta)$ is an optimum, $\pi(l, l^*, \theta) \ge 0$, by (A2). Also, because
$\pi$ is increasing in $\theta$, $\pi(l, l^*, \theta') > \pi(l, l^*, \theta) \ge 0$. Now if $l$ were an optimal
employment level in state $\theta'$, we would require $\pi(l, l^*, \theta') \le 0$ (by [A3]), which
contradicts the statement above. Therefore, no $l < l^*(\theta)$ can be an optimal
employment level in a state $\theta' > \theta$.
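The employment rule described by lemma A1, assumption A1, and definition A2 can be illustrated on a discrete grid of workers. This is a sketch under stated assumptions rather than part of the proof: workers are indexed 1, 2, ... in seniority order, the integral in the profit definition is replaced by a sum, and the productivity and wage numbers are hypothetical.

```python
def optimal_employment(theta, f, w):
    """Largest profit-maximizing number of workers, hired in seniority order.

    f[i] and w[i] are the marginal product and wage of the (i + 1)th worker.
    Ties are resolved toward the larger employment level, as in assumption A1.
    """
    best_l, best_profit, running = 0, 0.0, 0.0
    for l, (f_l, w_l) in enumerate(zip(f, w), start=1):
        running += theta * f_l - w_l       # cumulative profit from hiring workers 1..l
        if running >= best_profit:         # ">=" picks the largest optimal level
            best_l, best_profit = l, running
    return best_l


f = [4.0, 3.0, 2.0, 1.0]                   # declining marginal products

# A flat wage profile: employment rises one rank at a time as the state improves.
flat_wage = [2.0, 2.0, 2.0, 2.0]
flat_levels = [optimal_employment(th, f, flat_wage) for th in (0.5, 1.0, 2.0)]

# A profile with a cheap worker behind an expensive one: workers 2 and 3 are
# only ever hired together, so employment jumps from 1 to 3 (a bundling interval).
bundled_wage = [1.0, 3.0, 1.0, 3.0]
grid = [k / 8 for k in range(17)]          # states 0, 0.125, ..., 2
bundled_levels = [optimal_employment(th, f, bundled_wage) for th in grid]
```

In both cases the chosen employment level is nondecreasing in the state, as lemma A3 asserts. In the second case an employment level of exactly 2 is never chosen, so workers 2 and 3 share the same critical state.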

Definition A3. Define $\theta^*(l)$ as the lowest state in which worker $l$ is
employed. Since, by lemma A3, he is employed in all higher states, we may write
his employment probability as simply $p(l) = M(\theta^*(l))$.

Lemma A4. $\theta^*(l)$ is nondecreasing.

Proof. This follows directly from lemma A3.

Definition A4. A worker, $l$, is defined as belonging to a "bundling interval"
if and only if his "critical state," $\theta^*(l)$, and hence his employment probability,
is the same as some other worker's.

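Lemma A5. If a worker, $l$, belongs to a bundling interval $(\underline{l}, \bar{l}]$, his critical
state, $\theta^*(l)$, is given by

$$\theta^*(l) = \frac{\int_{\underline{l}}^{\bar{l}} w(x)\,dx}{\int_{\underline{l}}^{\bar{l}} f(x)\,dx}, \tag{A5}$$

where $\underline{l} < l \le \bar{l}$. The quantities $\underline{l}$ and $\bar{l}$ are the bounds of the bundling interval
in which worker $l$ is contained. Note that all workers in this interval share the
same $\theta^*(l)$.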

Proof. If worker $l'$ shares the same value of $\theta^*$ as some other worker, $l''$,
then, by lemma A4, all the workers between $l'$ and $l''$ must share that value of
$\theta^*$. In other words, $l'$ and $l''$ are on a flat section of $\theta^*(l)$. Consider now the
upper and lower bounds of this interval, $\bar{l}$ and $\underline{l}$. By definition, $l^*(\theta)$ jumps
upward, from $\underline{l}$ to $\bar{l}$, at the point $\theta = \theta^*(l)$, where $\underline{l} < l \le \bar{l}$. In state $\theta^*(l)$, the
firm is therefore indifferent between $\underline{l}$ workers or $\bar{l}$ workers. This implies,
from (A2) and (A3), that (A5) must hold since both $\underline{l}$ and $\bar{l}$ are optimal levels
of $l$.

Lemma A6. If a worker, $l$, does not belong to a bundling interval, his critical
state, $\theta^*(l)$, is given by $\theta^* = w(l)/f(l)$.

Proof. If a worker, $l$, does not share a critical state, $\theta^*$, with any other
workers, then we know from lemmas A3 and A4 that $l$ must be the firm's
optimal employment level in some state, $\theta^*(l)$. By (A4), this implies
$\theta^*(l)f(l) \ge w(l)$.

Now suppose that the preceding inequality were strict, that is, $\theta^*(l)f(l) >
w(l)$, and consider what happens to optimal employment as the state is
lowered infinitesimally below $\theta^*(l)$ to $\theta'$. Since $\theta^*(l)$ is defined as the lowest state in
which $l$ is employed, optimal employment must fall, and therefore $l = l^*(\theta^*(l))$
must violate one of the conditions for an optimum, (A2)-(A4), in state $\theta'$. It
will not violate (A3) since $\pi$ increases in $\theta$. It will not violate (A4) because the
inequality was strict in state $\theta^*(l)$. It must therefore violate (A2); that is, we
must have $\pi(l', l^*(\theta^*(l)), \theta') < 0$ for some $l' < l^*(\theta^*(l))$. This is possible only if
(A2) held with equality for $l'$ in state $\theta^*$, that is, $\pi(l', l^*(\theta^*(l)), \theta^*(l)) = 0$. But
this means that worker $l'$ is bundled with worker $l$, which is a contradiction.
Therefore, the inequality cannot be strict, and we must have $\theta^*(l) = w(l)/f(l)$.

Q.E.D.

B. Proof of Proposition 3

Focus initially on worker $\hat{l}$, defined by $\bar{\theta} f(\hat{l}) = \bar{w}$, and recall that $M(\bar{\theta}) = 0$.
This is the least senior worker ever hired in a nonunion firm; it is clear from
the first-order condition (11) that if this worker belongs to a U-interval in the
union firm, then $w(\hat{l}) = \bar{w}$, and thus he is also the least senior worker ever
hired in a union firm. If, on the other hand, $\hat{l}$ belongs to a B-interval $(\underline{l}, \bar{l}]$,
note first that he must be the least senior worker in that interval, that is, $\hat{l} = \bar{l}$,
because inducing the firm to hire $l > \hat{l}$ in state $\bar{\theta}$ would violate the individual
rationality constraint. Finally, note that $\theta^*$ on the interval $(\underline{l}, \bar{l}]$ must equal
$\bar{\theta}$, the same as if worker $\hat{l}$ was in a U-interval, for the following reason:
$\hat{l}$ is an optimal employment level for the firm in state $\theta^*$, which requires
(from [A4]) that $\theta^* f(\hat{l}) \ge \bar{w}$, or $\theta^* \ge \bar{w}/f(\hat{l}) = \bar{\theta}$. Since $\bar{\theta}$ is the upper limit
of $\theta$, the equality must hold.

Focus now on an arbitrary worker $l < \hat{l}$. I show below that (i) if $l$ is in a U-
interval, $p(l)$ is always strictly lower in the union, and (ii) if $l$ is in a B-interval
$(\underline{l}, \bar{l}]$, $p(l)$ is strictly lower in the union unless $l = \bar{l}$, when $p(l)$ may be the same
in union and nonunion firms.

i) In a nonunion firm, $\theta^*(l) = \bar{w}/f(l)$. In a union, if this worker belongs to a
U-interval, he has $\theta^*(l) = w(l)/f(l)$. But the first-order condition requires
$w(l) > \bar{w}$ except for the most junior worker ever employed (where $M = 0$);
thus $\theta^*(l)$ is strictly greater in the union firm.

ii) If, in the union, worker $l$ belongs to a B-interval, we know that $\theta^*(l)$ is
constant for $\underline{l} < l \le \bar{l}$. Since $\theta^*(l)$ in a nonunion firm increases smoothly with $l$,
consider worker $\bar{l}$. If his $\theta^*$ is higher in the union, then the $\theta^*$ of all workers in
the interval $(\underline{l}, \bar{l}]$ must be so as well.

Now since $\bar{l}$ is an optimal employment level for the firm in state $\theta^*$, we have
from (A4) that $\theta^* f(\bar{l}) \ge w(\bar{l})$, or $\theta^* \ge w(\bar{l})/f(\bar{l}) \ge \bar{w}/f(\bar{l})$, where the second
inequality is from the individual rationality constraint. Finally, consider any
worker $l < \bar{l}$ still in the B-interval. His critical state is still $\theta^*$ in the union but
equals $\bar{w}/f(l) < \bar{w}/f(\bar{l})$ in the nonunion firm as long as $\bar{\theta} > \underline{\theta}$. Therefore, his
employment probability must be higher in the nonunion firm than in the
union firm. Q.E.D.

C. Proof of Proposition 4 

The proof consists of two parts. The first shows that, if the unconstrained
solution to the first-order condition for an optimal U-policy, (11), does in
fact constitute a "feasible" U-policy over a given interval, that is, satisfies
$d\theta^*(l)/dl > 0$, it yields higher rents than the rent-maximizing B-policy on that
interval. Second, I show that, under the assumptions of the model, the unconstrained
solution to (11) is always a feasible U-policy in the sense above.

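i) Consider an arbitrary interval $(\underline{l}, \bar{l}]$, together with an arbitrary wage
profile $w^B(l)$, which induces bundling of all workers on $(\underline{l}, \bar{l}]$, that is, induces
$d\theta^*(l)/dl = 0$, $\forall\, l \in (\underline{l}, \bar{l}]$. Note that this profile yields total rents of

$$\mathcal{W}^B = \int_{\underline{l}}^{\bar{l}} [w^B(l) - \bar{w}]\, M(\theta^*)\, dl. \tag{A6}$$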
The profile $w^B$ yields the same expected total rents as another profile,
$w^{B'}$, which also bundles all workers on $(\underline{l}, \bar{l}]$ and is constructed as follows:
$\int_{\underline{l}}^{\bar{l}} w^{B'}(l)\,dl = \int_{\underline{l}}^{\bar{l}} w^B(l)\,dl$; that is, the total wage bill over the interval is the same,
and $w^{B'}(l)/f(l) = \theta^*$, $\forall\, l \in (\underline{l}, \bar{l}]$, where $\theta^*$ is constant over the interval. (This
wage profile coincides with the firm's demand curve in state $\theta^*$.) Total expected
rents extracted using $w^{B'}(l)$ can be written as

$$\mathcal{W}^B = \int_{\underline{l}}^{\bar{l}} [w^{B'}(l) - \bar{w}]\, M\!\left(\frac{w^{B'}(l)}{f(l)}\right) dl. \tag{A7}$$

Thus the level of rents achievable via any wage profile, $w^B$, that induces
bundling of workers over the entire interval $(\underline{l}, \bar{l}]$ can be expressed in the form
of (A7), with $w^{B'}$ defined as above.

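Now consider the optimal U-policy on $(\underline{l}, \bar{l}]$ and note that by definition it
solves

$$\max_{w(l)} \mathcal{W}^U = \int_{\underline{l}}^{\bar{l}} [w(l) - \bar{w}]\, M\!\left(\frac{w(l)}{f(l)}\right) dl. \tag{A8}$$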


Call the solution to (A8) $w^*(l)$ and assume that it is a feasible U-policy (i.e.,
exhibits $d\theta^*/dl > 0$). Clearly, since $w^*(l)$ maximizes $\mathcal{W}^U$, since (A7) and (A8)
are isomorphic and, by (13), globally concave in wages, and since $w^{B'}(l)$ and
$w^*(l)$ differ (one induces $d\theta^*/dl > 0$, the other $d\theta^*/dl = 0$), $\mathcal{W}^B < \mathcal{W}^U$.

Since the argument above applies to any interval and any profile that induces
bundling, it implies that the union will never bundle workers under the
stated conditions.

ii) Note first that (11) can be written as

$$f M(\theta^*) = (\theta^* f - \bar{w})\, m(\theta^*). \tag{A9}$$

Totally differentiating (A9), setting $d\bar{w} = 0$, and rearranging yields

$$\frac{d\theta^*}{df} = \frac{M - \theta^* m}{A}, \tag{A10}$$

where

$$A = 2fm + (\theta^* f - \bar{w})\, m'. \tag{A11}$$

Solving (A9) for $(\theta^* f - \bar{w})$ and substituting into (A11) yields

$$A = \frac{f}{m}\,(2m^2 + Mm'), \tag{A12}$$

which is positive given (13). Finally, to establish the sign of the numerator in
(A10), note that (A9) can be rewritten as

$$\frac{M}{m} - \theta^* = -\frac{\bar{w}}{f} < 0. \tag{A13}$$

This implies that the numerator of (A10) is negative, and hence $d\theta^*/df < 0$.
Since $f$ is negatively related to $l$, $\theta^*$ must be strictly increasing with $l$ in (11) as
long as (13) holds. Q.E.D.
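
The monotonicity result in part (ii) can be checked numerically. The sketch below is only an illustration under assumptions that are not in the original: the state is taken to be uniform on [0, 1], so that M(θ) = 1 − θ and m(θ) = 1, and the function names and parameter values are hypothetical. Under that assumption (A9) reduces to θ* = 1/2 + w̄/(2f), which falls as f rises.

```python
def survivor(theta):
    """M(theta): probability that the state is at least theta.

    Assumes the state is uniform on [0, 1] (an illustrative choice).
    """
    return min(1.0, max(0.0, 1.0 - theta))


def rent(w, f, w_bar):
    """Expected rent on one worker: (w - w_bar) * M(w / f)."""
    return (w - w_bar) * survivor(w / f)


def optimal_wage(f, w_bar, steps=20_000):
    """Grid search for the rent-maximizing wage on [w_bar, f]."""
    best_w, best_rent = w_bar, 0.0
    for i in range(steps + 1):
        w = w_bar + (f - w_bar) * i / steps
        r = rent(w, f, w_bar)
        if r > best_rent:
            best_w, best_rent = w, r
    return best_w


W_BAR = 0.5  # hypothetical nonunion wage
# Marginal products listed from the most senior worker (highest f) downward.
marginal_products = [2.0, 1.5, 1.0]
critical_states = [optimal_wage(f, W_BAR) / f for f in marginal_products]
print(critical_states)
```

In this example the critical state rises as the marginal product falls, so less senior workers are employed in fewer states, and each union critical state exceeds the nonunion value w̄/f.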
D. Proof of Proposition 6

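Under risk aversion, the optimal wage for worker $l$ is the one that maximizes

$$\mathcal{W} = [V(w) - V(\bar{w})]\, M(w/f). \tag{A14}$$

Differentiating with respect to $w$ yields

$$\frac{\partial \mathcal{W}}{\partial w} = V'(w) M - [V(w) - V(\bar{w})]\,\frac{m}{f}, \tag{A15}$$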

which is positive when $w \le \bar{w}$. Therefore, the optimal union wage is always
above the nonunion wage, $\bar{w}$. Setting (A15) equal to zero and letting
$\mathcal{M} \equiv m/M$ denote the hazard function, we can write the first-order condition,
(A15), as

$$\frac{V(w) - V(\bar{w})}{V'(w)} = \frac{f}{\mathcal{M}(w/f)}. \tag{A16}$$

This can be compared with (from [26])

$$w - \bar{w} = \frac{f}{\mathcal{M}(w/f)} \tag{A17}$$

in the case of risk neutrality. Strict concavity of the utility function implies
that the left-hand side of (A16) exceeds that of (A17). Therefore, if $\mathcal{M}' > 0$
(increasing hazard), the union wage under risk aversion must fall below the
wage under risk neutrality. Strict concavity of $V$ also implies that the left-hand
side of (A16) is increasing in $w$; this fact, in combination with an argument
analogous to that used in the discussion of (26), can be used to show $dw/df > 0$;
that is, wages increase with seniority. Q.E.D.
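
As a rough check on the comparison between (A16) and (A17), the following sketch solves both conditions numerically. It rests on assumptions that are not in the original: logarithmic utility, a state that is uniform on [0, 1] (so the hazard is 1/(1 − θ), which is increasing), and hypothetical parameter values and function names.

```python
import math


def bisect(g, lo, hi, tol=1e-10):
    """Root of a decreasing function g on [lo, hi], with g(lo) > 0 > g(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def wage_risk_neutral(f, w_bar):
    # (A17) with hazard 1 / (1 - w / f):  w - w_bar = f - w
    return 0.5 * (f + w_bar)


def wage_risk_averse(f, w_bar):
    # (A16) with V = log and the same hazard:  w * log(w / w_bar) = f - w
    return bisect(lambda w: (f - w) - w * math.log(w / w_bar), w_bar, f)


W_BAR = 0.5  # hypothetical nonunion wage
for f in (2.0, 1.5):
    print(f, wage_risk_neutral(f, W_BAR), wage_risk_averse(f, W_BAR))
```

With these assumed forms the risk-averse wage lies between the nonunion wage and the risk-neutral wage, and it is higher for the worker with the higher marginal product.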


E. Proof of Proposition 7 

To find the optimal assignment of seniority indexes in the presence of specific
human capital, note first that the union's total opportunity cost of labor
schedule is $W(L) = \bar{w} N(L)$, where $N(L)$ is the inverse of $L(N)$ in (33). Also note
that $W(L)$ must satisfy the constraint $W(L) \ge W_{\min}(L)$, where $W_{\min}(L)$ is the
lowest feasible total opportunity cost of $L$ units of human capital, which can be
achieved only by assigning seniority indexes (inversely) according to in-firm
productivity. Now suppose that we allow the union to choose $W(L)$ freely,
subject only to this inequality constraint. Then if the union's optimal $W(L)$ is
achievable via a one-to-one seniority mapping $n(z)$, it must be the optimal
$W(L)$ for all possible seniority mappings.

The rent-maximizing union thus solves 

max r [W'(0 - (A 18) 

wtf). W(l) J1-0 L f(l) J 


subject to

$$W(l) \ge W_{\min}(l). \tag{A19}$$
Defining the control variables 5) = R'' and w = W' yields the Hamiltonian:

'% = (W - + *!• + *2® + ~ ^nun). (A20) 

To see what the optimal ® will be here, it is sufficient to note that, at the
optimum, the inequality constraint (A19) must always be binding (this can be
seen by deriving first-order conditions, which imply <t» > 0, $\forall\, l$, because Kg
must be monotonically decreasing in $l$, given no bundling). Thus the optimal
$W(l) = W_{\min}(l)$, which is of course achievable only via the one-to-one seniority
assignment rule that assigns higher seniority to workers with the most specific
capital. Q.E.D.


F. Proof of Proposition 8 

Since hours are zero for all unemployed workers, the hours rule can also be
summarized by a function $N(L)$ giving total employment and $e(n, L)$ defined
only for employed workers $n \in (0, N(L)]$. Now, given an hours rule, the total
human capital employed by the firm, $L$, satisfies

$$L = \int_{n=0}^{\bar{N}} h(e(n, L), n)\,dn = \int_{n=0}^{N(L)} h(e(n, L), n)\,dn. \tag{A21}$$


Its total opportunity cost to the union is given by

$$W(L) = \int_{n=0}^{N(L)} \bar{w}\, e(n, L)\,dn. \tag{A22}$$

The sum of union members’ expected utilities in this case is 

°W = f* [ S (u(n, 0) - Se(n. B)]m(B)d»dz. (A23) 

J«-o Je-s 

Equation (A23) can be written, using a continuous-hours analogue of the 
property rights assumption, 


rz ri r rut) 

l)dl - S5*(», L{9)) 

Jn-oJe-» Ui-n 


\m(6)d9dz. (A24) 


Noting that J^.o n)«t(n, l)dl = l (this follows from differentiation of 
[A21]), we can rewrite this as 


re rw ft r. 

= w(l)dlm(8)dB - 

J«-« Jr—o /«-« J* 

’t 


win, L(6))dnm(0)dd 

# -° (A25) 


[W(£(0)> - W(m)]tn(9)de, 


which is just another way of expressing total rents extracted from the firm,
(A8). The only difference between the union's problem here and the basic
model is again the fact that the marginal opportunity cost of labor,
$\bar{w}(L) \equiv W'(L)$, to the union is now endogenous. The optimal hours rule is related to
the optimal $W(l)$ via the expression

$$\bar{w}(L) = \frac{d}{dL} \int_{n=0}^{N(L)} \bar{w}\, e(n, L)\,dn = \bar{w} \int_{n=0}^{N(L)} e_L(n, L)\,dn. \tag{A26}$$


Q.E.D. 
The Division of Labor, Local Markets, and
Worker Organization 


James R. Baumgardner 

Duke University 


The model developed here explains differences in the degree of the
division of labor across local markets. For example, the typical doctor
in one local market may treat a wider range of patients' problems
than his counterpart in another local market. The extent of the
division of labor depends on variables affecting the local demand for
services. Two different forms of worker organization, cooperation
and noncooperation, yield divergent results. For example, specialization
increases with the local number of producers under cooperation
but may decrease under noncooperation. With the number of
producers constant, local demand variables affect individual specialization
under noncooperation but not under cooperation.


I. Introduction 

Some doctors treat only heart disease while others treat heart disease, 
appendicitis, and cancer. Some lawyers handle only real estate while 
others handle real estate, divorces, and trusts. These examples stress 
choices in the number of activities performed. How do these choices 
differ across local markets? Can variations in the number of distinct 
activities an individual provides be explained on the basis of local 
demand variables? Will the number of local worker-producers affect 
the number of activities produced by any one of them? Will the degree
of the division of labor depend on the organizational structure of
the worker-producers? 

I thank all those who have provided helpful comments and discussions, especially 
Gary S. Becker, David Dranove, Daniel A. Graham, Ann E. Herington, Boyan
Jovanovic, Sherwin Rosen, Asher Wolinsky, and two anonymous referees. The usual 
disclaimer applies. 
I develop a model to explain the choice of the number of independent
services an individual provides. The model views the economy as
a set of local markets that differ in market-specific demand shifters 
such as local population. Each income-maximizing worker can sell 
activities only to consumers who live within one local market. This 
restriction is motivated by the importance of transportation costs in 
service industries in which producer and consumer must meet for 
production to occur. 

The model is driven by the trade-off between increasing returns to
production of each activity for the individual worker and falling marginal
revenue with output of an activity. The worker-producer's incentive
to narrow the number of his activities and take fuller advantage
of scale economies is countered by his incentive to generalize,
reducing output per activity, which keeps marginal revenue high for
each produced activity.

With a monopolist worker-producer in a local market, an increase 
in local population increases specialization. More population reduces 
the problem of falling marginal revenue, and the worker responds by 
specializing into fewer activities. An increase in productive endowment,
such as total available time, will increase the number of produced
activities.

With several workers in a market, worker cooperation results in a 
productively efficient division of labor with workers segregated into 
their own subsets of the different activities. Worker noncooperation 
yields counter results. Overlap of workers across sets of activities can 
occur. Overlap is not efficient from a production standpoint (there is 
too little specialization), but noncooperation can dominate coopera¬ 
tion in consumers' surplus. 

Under cooperation, a demand-constant increase in the number of 
producers increases specialization as the activities are divided among 
more workers. With noncooperation this result is reversed: more 
workers can result in less specialization. Cooperation shows no change 
in individual specialization when a local demand variable changes 
while the number of workers is held constant. Noncooperation does 
display changes in individual specialization in response to local demand
shifters.

I also derive results for across-market equilibrium in which the 
number of producers in a local market is endogenous. Equilibrium 
across markets implies equal revenue per worker in each locale that 
has a worker. If each market is organized as a cooperative but cannot 
prevent immigration of producers, the elasticity of producers with 
respect to local population is greater than or equal to one. Also, a 
locale that has a greater equilibrium number of producers for any 
reason (demand or supply variables) will display more specialization.
With noncooperation within each locale, the results are not as strong. 

Economic attention to the determinants of the division of labor
goes back to the classic statement by Adam Smith ([1776] 1937). Stigler
(1951) examines vertical disintegration as industry output grows.
His model uses different cost functions in different activities. My
model has symmetric production technologies across activities.
Becker (1981, 1985), Rosen (1982, 1983), Gros (1983), and Barzel and
Yu (1984) use the concept of human capital specific to particular
activities to generate gains from specialization. The production technology
of this paper can also be derived from activity-specific human
capital arguments.

Recent work in international trade (see, e.g., Krugman 1979; Lancaster
1980; Helpman 1981; Gros 1983) has explained intraindustry
trade of differentiated products. Two fundamental differences between
my model and the differentiated products models must be
emphasized. First, the different activities (or products) in my model 
are different products that are independent in demand. I am not 
talking about different styles of the same product where different 
styles are substitutes in demand. Second, the central concern here is 
the fraction of these different products that a single individual will 
produce depending on various characteristics of the local market. 

I proceed as follows. Section II lays out the ingredients of the 
model. Section III considers the case of a monopolist worker- 
producer within a locale. Many workers in a market are discussed in 
Sections IV and V. Section IV assumes cooperation of local workers, 
and Section V looks at noncooperation. Both sections examine the 
within-locale determinants of the division of labor and also the characteristics
of across-market equilibrium. Section VI concludes the paper.


II. Ingredients 

A. Activities Segment 

The set of possible activities that the worker may produce is represented
by a segment of length one. Different activities are indexed by
$s \in [0, 1]$. These activities represent all the different activities in which
a producer in the service industry of interest may engage. 1

1 Presumably, some common thread on the production side defines the activities that 
make up this segment in a particular industry. For example, in the medical services 
industry, a general knowledge of anatomy or the requirement of a license is needed to 
allow production of any of the activities on the "medical activities segment." Analogously,
there is a "legal activities segment" for that industry.
B. Demand

The demand price ($p$) for each activity on the segment depends on
output of that activity ($Q$) and a vector of local market demand shifters ($b$):

$$p(s) = p(Q(s), b), \tag{1}$$

with $p_Q < 0$. 2

Several comments are in order regarding the two ingredients
above. As pointed out in the Introduction, equation (1) assumes that
demand for activity $s$ is independent of output of other activities.
Examples of products with approximately independent demand are
treatments for strep, diabetes, or delivery of a baby; however, there
may be some doctors who provide all three of these services and
others who treat only a subset. For simplification, let us assume symmetric
demand across all the activities on the segment. These
simplifications provide tractability and sharpen attention on the main
choice variable of interest: the range of different activities produced
by a single worker.


C. Production Technology 

The workers produce output according to

$$q(s) = (x(s))^2, \tag{2}$$

where $x(s)$ is units of an individual's input used in activity $s$ and $q(s)$ is
the individual's output of $s$. The essential feature of (2) is that it
exhibits increasing returns in the input. 3

As on the demand side of the problem, let us assume symmetry
across activities in the production technology. Because of the symmetry
and independence assumptions, the actual identities and relative
positions of activities on the 0-1 segment are of no consequence
in this model. The length of the segment a worker chooses (his degree 

2 I will place further restrictions on the inverse demand functions later in the paper.
These restrictions will guarantee that marginal revenue falls sufficiently quickly with
output to meet second-order conditions.

3 The more general form, $q = Ax^{\alpha}$, with $A > 0$ and $\alpha > 1$, gives all the qualitative
results. One potential source of these increasing returns is a production technology in
which output of an activity depends on both activity-specific human capital and production
time. Consider the technology $q(s) = m(s)^{\gamma} t(s)$, where $m(s)$ is units of skill specific to
production of $s$, $t(s)$ is time used directly producing $s$, and $\gamma > 0$. The worker has a
limited endowment of time, and time is required to produce specific skills. It follows
that the technology displays increasing returns in time devoted to $s$ since time devoted
to $s$ includes both investment time and direct production time. See Edwards and Starr
(1987) for a different treatment of the sources of increasing returns and gains from
specialization.
of generalization) and quantity per produced activity (his intensive
decision) are the main concerns here. 

D. Endowment of Productive Input 

The final ingredient to be described is a constraint on the worker’s 
endowment of the productive factor. This constraint is 

$$E \ge \int_A x(s)\,ds, \tag{3}$$

where A is the subset of the activities segment the worker chooses to 
produce, and E is the worker’s endowment. 

In summary, we have an activities segment consisting of the activities
(or products) that can be produced. Within a locale, demand price
for each of these activities is falling in output of the activity and
depends on local market demand shifters. Workers have a limited
endowment of a productive input to allocate across different activities,
each of which is produced with increasing returns to scale.

III. A Local Monopolist Worker 

The case of a worker-producer who is a monopolist in his local market 
serves as a convenient starting point. In this section we will look at the 
monopolist's income-maximizing decision and the effects of changes 
in demand shifters and worker endowment. To the ingredients of 
Section II will be added the assumption that demand becomes more 
inelastic as output rises: 

$$\eta_Q(Q, b) > 0, \tag{4}$$

where $\eta \equiv p/(Q p_Q) < 0$. 4
The producer's objective is

$$\max_{A,\, x(s)} Y = \int_A p(q(s), b)\, q(s)\,ds \tag{5}$$


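4 Notice that demand elasticity has been defined so that it is a negative number. Thus
$\eta_Q > 0$ means that demand becomes less elastic as output rises. Also, it is easily verified
that this assumption is equivalent to assuming that the ratio of demand price to marginal
revenue (MR) rises in quantity:

$$\eta_Q = \frac{Q p_Q^2 - p(Q p_{QQ} + p_Q)}{(Q p_Q)^2},$$

$$\frac{\partial}{\partial Q}\left(\frac{p}{\mathrm{MR}}\right) = \frac{\partial}{\partial Q}\left(\frac{\eta}{\eta + 1}\right) = \frac{\eta_Q}{(\eta + 1)^2}.$$

Thus $\eta_Q > 0 \Leftrightarrow \partial(p/\mathrm{MR})/\partial Q > 0$ for $\mathrm{MR} \ne 0$.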
subject to (2) and (3). The worker chooses a subset of the activities ($A$)
and the amount of input to commit to each of these activities in order
to maximize income ($Y$). Notice that $q(s) = Q(s)$ since the worker is a
monopolist; hence, $q$ appears as the argument in $p(\cdot, b)$ in (5). Pointwise
maximization over the chosen $A$ gives the necessary condition

$$2x(s)[p(q(s), b) + q(s)\, p_q(q(s), b)] - \lambda = 0 \quad \text{for all } s \in A, \tag{6}$$

where $\lambda$ is the Lagrange multiplier associated with endowment constraint (3).

Condition (6) is the familiar maximization condition that marginal
revenue product equals the shadow price ($\lambda$) of the productive factor
on the set of activities that are produced. The fundamental trade-off
of the model appears in (6). By reducing the number of produced
activities, output per activity is increased and so is marginal product,
which equals $2x(s)$. The marginal product effect encourages specialization.
The counter force is that as output per produced activity
("intensive" output) rises, marginal revenue (in brackets in [6]) falls.
Falling marginal revenue with intensive output encourages production
of more activities (an "extensive" increase).

Since (6) holds for all produced activities and the marginal product
and marginal revenue functions are the same for all activities (because
of the symmetry assumptions), it follows that $x(s) = x$ and $q(s) = q = x^2$
for all produced activities. Also, constraint (3) becomes $E = |A|x$ when
binding. The objective can be restated as

$$\max_{\delta} Y = [\delta\, p(q, b)\, q] = \delta\, p\!\left(\frac{E^2}{\delta^2}, b\right) \frac{E^2}{\delta^2}, \tag{7}$$

where $\delta \equiv |A|$; $\delta$ is the length of the activities set chosen and can take
values between zero and one. It is convenient to think of $\delta$ as an index
of generalization.

The first-order condition for (7) is

$$-p(q, b) - 2q\, p_q(q, b) = 0. \tag{8}$$

The second-order condition is

$$3p_q + 2q\, p_{qq} < 0. \tag{9}$$

The second-order condition requires that marginal revenue fall fast 
enough as output rises. Assumption (4) is equivalent to (9) at the 
solution to (8). 
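
To make conditions (8) and (9) concrete, here is a minimal numerical sketch. It assumes a linear inverse demand curve, p = a − cQ/pop, which satisfies assumption (4) but is not taken from the original; the parameter values and function names are likewise hypothetical. The grid search maximizes objective (7) directly, and the result can be compared with the closed form q = a·pop/(3c) implied by (8).

```python
def price(q, a=12.0, c=1.0, pop=1.0):
    """Inverse demand p(Q, b) = a - c * Q / pop (linear form assumed for illustration)."""
    return a - c * q / pop


def income(delta, endowment=1.0):
    """Objective (7): Y = delta * p(q) * q with q = (E / delta) ** 2."""
    q = (endowment / delta) ** 2
    return delta * price(q) * q


# Search delta over (0, 1] on a fine grid.
grid = [i / 10_000 for i in range(1, 10_001)]
delta_star = max(grid, key=income)
q_star = (1.0 / delta_star) ** 2

slope = -1.0  # p_q = -c / pop for the assumed demand curve and default parameters
elasticity = price(q_star) / (q_star * slope)
print(delta_star, q_star, elasticity)
```

At the maximizer the demand elasticity is close to −2, which is the rearranged form of condition (8).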

5 Differentiation of (4) shows $\mathrm{sgn}\{\eta_Q\} = \mathrm{sgn}\{Q p_Q^2 - p\,p_Q - Q\,p\,p_{QQ}\}$. From (8) and
$q = Q$ under monopoly, it follows that $\eta_Q > 0$ is equivalent to (9) at the optimum.
Viewing the monopolist's problem in elasticity form provides further insight. Income is
$Y = \delta\, p(q)\, q$, where the demand shifters have been suppressed. Given an arbitrary choice
of $\delta$, which implies a value of $q$, consider a 1 percent reduction in $\delta$. Since $q = (E/\delta)^2$, $\varepsilon_{q,\delta}$
Two comparative statics results of interest are the effects of local
population and the worker's endowment, respectively, on specialization.
For the general case of a change in a single demand shifter ($b$),

$$\frac{d\delta}{db} \gtreqless 0 \quad \text{as} \quad
\begin{cases}
p_b + 2q\, p_{qb} \lesseqgtr 0, \\[4pt]
p_b - \dfrac{p}{p_q}\, p_{qb} \lesseqgtr 0, \\[4pt]
\eta_b \gtreqless 0.
\end{cases} \tag{10}$$


The three statements in (10) are equivalent at optimal choices. The
second follows from the first by application of (8), while the third
follows from differentiation of the demand elasticity. 6 The final statement
says that a change in a demand variable that increases the elasticity
of demand at each $Q$ ($\eta$ decreases algebraically) will increase
specialization (decrease $\delta$).

An increase in the local population (homogeneous consumers assumed)
increases specialization. Letting $b = \mathrm{Pop} \equiv$ local population,
we can establish that $p_b$ and $p_{qb}$ in (10) are both greater than zero. 7


$= -2$, where $\varepsilon_{q,\delta}$ is the production elasticity of intensive with respect to extensive
output. The 1 percent fall in $\delta$ provides a 2 percent increase in $q$. If $p$ were not affected
by changes in $q$, $Y$ would increase and one would continue to the point of complete
specialization. However, the 2 percent increase in $q$ leads to a fall in price by $(2 \times 1/|\eta|)$
percent. It follows that the original 1 percent reduction in $\delta$ will lead to an increase in $Y$
iff $|\eta| > 2$. If $q$ were at a level at which $|\eta| < 2$, then a reduction in $\delta$ with the resultant
rise in $q$ would cause a large enough fall in price that $Y$ would fall. Condition (8) can be
rearranged to $\eta = p/(q p_q) = -2$. Income is maximized when $q$ is such that the demand
elasticity is $-2$. At that point, the lost revenue from reducing the number of activities is
just offset by the increased intensive revenue. Second-order condition (9) and the
equivalent assumption (4) are more easily understood now. Maximization of income
implies that the worker will push intensive output into a region in which demand
becomes more inelastic with output. Violations of assumption (4) come in two categories:
(a) demand becomes more elastic as output rises or (b) demand is isoelastic. In case
a, increasing intensive output would provide a better and better trade-off of intensive
revenue for extensive revenue. This would preclude an interior maximum, i.e., $\delta \in (0, 1)$.
In case b, if demand elasticity were always above two in magnitude, one could always
gain from a reduction in $\delta$ and would be driven to complete specialization. If the
elasticity were always below two in magnitude, one would completely generalize.

6 The partial derivative $\eta_b = [1/(q (p_q)^2)](p_q p_b - p\, p_{qb})$. The reversal of the inequalities
from the second to the third line of (10) follows from the fact that $p_q < 0$.

7 To see this, recognize that $Q_D = \mathrm{Pop} \cdot q_D(p)$, where $Q_D$ is the total quantity
demanded, Pop is the population of consumers, and $q_D$ is the per capita quantity
demanded. Totally differentiating, we get

$$dQ_D = \mathrm{Pop}\,\frac{dq_D}{dp}\,dp + q_D\,d\mathrm{Pop}.$$

It follows that at a constant $Q_D$, $dp/d\mathrm{Pop} > 0$. Also,



Thus an increase in Pop flattens the market demand curve ($\partial^2 p/\partial Q\,\partial \mathrm{Pop} > 0$).
We conclude that $\partial\delta/\partial \mathrm{Pop} < 0$. The increase in population lessens
the effect of falling marginal revenue, which is the force favoring
generalization.

An increase in endowment decreases specialization. Optimal intensive
output ($q$) is unaffected by a change in endowment (see [8]). The
increased endowment goes into increases on the extensive margin ($\delta$
rises). 8
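
Both comparative statics results can be illustrated with a closed-form sketch. It assumes the linear inverse demand p = a − cQ/pop, for which p_b and p_qb are both positive when b is population; this functional form, the parameter values, and the function name are illustrative assumptions rather than part of the original model.

```python
import math


def optimal_delta(pop, endowment, a=12.0, c=1.0):
    """Monopolist's choice for the assumed demand p = a - c * Q / pop.

    Condition (8) gives q* = a * pop / (3 * c), independent of the endowment.
    The binding constraint E = delta * x with q = x ** 2 gives
    delta = E / sqrt(q*), capped at the segment length of one. When the cap
    binds, the worker produces every activity and q* is no longer attained.
    """
    q_star = a * pop / (3.0 * c)
    return min(1.0, endowment / math.sqrt(q_star)), q_star


base = optimal_delta(pop=1.0, endowment=1.0)
more_people = optimal_delta(pop=4.0, endowment=1.0)
more_endowment = optimal_delta(pop=1.0, endowment=1.5)
print(base, more_people, more_endowment)
```

In this example a larger population lowers the index of generalization, while a larger endowment raises it and leaves intensive output unchanged.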

In across-market equilibrium with migration of homogeneous
workers permitted, each market with a worker will have worker income
equal to a common level ($\bar{Y}$). If the economy is composed of a
smooth distribution of locales that differ in population, the monopoly
case described here will occur only in knife-edge markets. Less
populated locales will have no producers, while more populated locales
will have enough producers so that each cannot behave as a
monopolist (i.e., $N$ is such that $N\delta^M > 1$, where $N$ is the local number
of workers, $\delta^M$ is the monopolist choice of $\delta$, and 1 is the length of the
activities segment). The next two sections discuss the latter case of
locales in which $N$ is so large that each producer cannot replicate the
monopoly solution.


IV. Local Worker Cooperation 

Consider a local market with a given number of worker-producers 
(N). This section examines the division of labor properties when the 
workers within a locale behave cooperatively to maximize revenue per 
worker. Since N is taken as given by the local cooperative, total local
worker income is maximized. Assume that N is large enough that
each producer cannot replicate the monopoly situation of Section III. 
First, we will look at results within a particular locale. Then we will 
look at results in an across-market equilibrium in which producers 
freely choose a locale. 

A. Results within a Local Market

Within a local market, worker-producers will be segregated across 
activities. Since the cooperative organizes to maximize total worker 
income in the locale, it will divide activities across the N producers in a
productively efficient manner. 

Because of the increasing returns at the individual level, efficiency
implies no overlap of workers on the activities segment. To see this,
consider some arbitrary subset of the activities ($\Delta s$) with length $\Delta s$ on

8 Proof appears in sec. 1 of the unpublished appendix available from the author on
request.
which two workers overlap. Say that each produced $q = M^2/(\Delta s)^2$,
where $M$ refers to the total amount of inputs each worker is using on
the $\Delta s$ activities. Total intensive output on $\Delta s$ is $Q = 2q = 2[M^2/(\Delta s)^2]$.
Such overlap is clearly not productively efficient since the workers
could be segregated with each producing $\Delta s/2$ activities. Under segregation,
output per activity on $\Delta s$ is $Q = q = M^2/(\Delta s/2)^2 = 4[M^2/(\Delta s)^2]$.
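
The arithmetic of the overlap argument can be reproduced in a few lines. The input level and subset length below are hypothetical values chosen only for illustration, and the function name is not from the original.

```python
import math


def output_per_activity(total_input, segment_length):
    """Technology (2): input spread evenly over a segment, q = x ** 2 per activity."""
    x = total_input / segment_length
    return x ** 2


M, DS = 3.0, 0.2  # hypothetical input per worker and length of the subset

# Overlap: both workers cover all of the subset, so market output per activity is 2q.
overlap_output = 2 * output_per_activity(M, DS)
# Segregation: each worker covers half of the subset with the same total input.
segregated_output = output_per_activity(M, DS / 2)
print(overlap_output, segregated_output)
```

Segregation doubles output per activity relative to overlap, using the same inputs.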

The degree of specialization of individual workers increases as the
number of workers ($N$) rises. The optimality of segregation of workers
across activities implies that $\delta = 1/N$ (assume producers have the
same $E$). Thus $\delta$ falls as $N$ rises. It also follows that $N$-constant changes
in local demand shifters do not affect $\delta$.

The income-maximizing cooperative will not produce quantities
($Q$) of the activities beyond the level at which market marginal revenue
is driven to zero. Under the segregated cooperative solution,
$Q = q = N^2E^2$. Once $N$ is large enough that $N^2E^2$ is at a level at which MR
< 0, the cooperative will leave some worker endowments unused so
that MR = 0. 9


B. Results in an Across-Market Equilibrium 


Now consider an economy composed of many local markets in which
$N$ is endogenous to each market. Assume that the workers within each
market will behave as a cooperative to maximize income per worker
but cannot prevent immigration. The homogeneous workers will locate
to equalize income in all markets at some common value $\bar{Y}$. With
$Q^0(b)$ defined as the intensive output at which a market with demand
shifters $b$ has MR = 0, a market's $N$ satisfies

$$\bar{Y} = \delta p q =
\begin{cases}
p(N^2E^2, b)\, NE^2 & \text{if } N^2E^2 \le Q^0(b), \\[4pt]
\dfrac{1}{N}\, p(Q^0(b), b)\, Q^0(b) & \text{if } N^2E^2 > Q^0(b).
\end{cases} \tag{11}$$

Locales with greater population will have a larger equilibrium 
number of producers and also more specialized producers. More 
populated markets have greater N in order to equalize across-market 
income. Since cooperatives have $\delta = 1/N$, the more populated markets
have more specialized workers. In fact, markets with greater N
for any reason (e.g., other demand shifters or supply shifters) will 
have more specialized producers. 

The elasticity of producers with respect to local population ($\varepsilon_{N,\mathrm{Pop}}$)


9 We assume that the cooperative will attempt to minimize resource use when it
produces the $Q$ corresponding to MR = 0. The segregated solution, $\delta = 1/N$, will
achieve this.
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is greater than one until one reaches a locale with a threshold popula¬ 
tion. For markets with populations above the threshold, e^pop “ 1. 
The elasticity result is driven by the increasing returns at the individ¬ 
ual level. Consider two local markets, one with twice the population of 
the other. If the larger market had twice as many producers, coopera¬ 
tion implies that they would each produce half as many activities. 
Because of the increasing returns, the larger market’s producers 
would be able to more than double both their intensive output and 
intensive revenue. With half the number of activities but more than 
double the revenue per activity, workers in the larger market would 
make more income. This could not be an equilibrium. Equilibrium 
requires more than twice as many producers in the larger market to 
equalize revenue per producer across locales. This argument requires 
that MR > 0 in the smaller market. If MR = 0 in the smaller market,
then only a doubling of workers need occur in the larger market since 
the cooperative would not choose to more than double Q because this 
would drive MR < 0. A formal proof of these elasticity results appears 
in the unpublished appendix that is available from the author. 
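The elasticity argument can be checked in a special case. The derivation below is an illustrative sketch only: it assumes a linear per-capita inverse demand $p(Q, b) = a - Q/b$, with $a > 0$ a hypothetical intercept and local population $b$ the only demand shifter. It is not the formal proof referred to above.

```latex
% Assumed for illustration: p(Q, b) = a - Q/b, with b the local population.
\begin{align*}
u &\equiv \frac{N^2E^2}{b}, \qquad MR = a - 2u, \\
Y &= (a - u)\,NE^2 \qquad \text{(branch with } MR \ge 0\text{)}, \\
0 &= d\ln N - \frac{u}{a - u}\,\bigl(2\,d\ln N - d\ln b\bigr)
     \qquad \text{(holding } Y \text{ fixed across markets)}, \\
\epsilon_{N,pop} &\equiv \frac{d\ln N}{d\ln b} = \frac{u}{3u - a}.
\end{align*}
% On a/3 < u < a/2 (extra producers lower income and MR > 0): epsilon > 1.
% At u = a/2 (MR = 0): epsilon = 1.
% Above the threshold, Y = a^2 b / (4N), so N is proportional to b and epsilon = 1.
```

Under these assumptions the elasticity exceeds one exactly when marginal revenue is positive and falls to one when the zero-MR constraint binds, which matches the pattern described in the text.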


V. Local Worker Noncooperation 

This section analyzes imperfect competition among the worker- 
producers in each locale. As in the previous section, attention focuses 
on cases in which N is large enough that the workers cannot replicate 
the monopolist solution of Section III. Unlike the cooperative case of
Section IV, this section assumes that each worker attempts to max¬ 
imize individual revenue subject to assumed output decisions of other 
local producers. 

Under noncooperation, each individual worker-producer chooses a 
subset of the activities segment and an allocation of inputs along this 
chosen subset. Cournot conjectures are assumed within each activity 
(intensively). A Cournot type of assumption is also made extensively. 
Individuals conjecture that increasing extensively into activities pro¬ 
duced by others will not alter the others’ extensive decisions. That is, 
if I move into one of my neighbor’s activities, I assume that he will not 
take a counter step into mine. 

Under noncooperation, overlap of workers along the activities seg¬ 
ment can occur in equilibrium. A possible configuration is illustrated 
in figure 1 (where, for clarity, the activities segment has been drawn as 
a circle of circumference one). Under Cournot, own marginal revenue
in an activity is $p(Q) + q\,p_Q(Q)$. Say that two workers (worker 1 and
worker 2) were organized in the segregated cooperative solution.
Increased output by worker 1 in his own subset of activities will drive
down the price of all his inframarginal units (under the cooperative
solution, the $q\,p_Q$ term in own marginal revenue equals $Q\,p_Q$). If, on
the other hand, worker 1 shifts some inputs into some of 2’s activities,
1 will ignore the fact that he drives down the price of 2’s inframarginal
units. Worker 1 is concerned only with $q_1\,p_Q$, not with $Q\,p_Q$. This
is the source of the incentive to break from the cooperative solution
and overlap other workers.

Fig. 1.—Each continuous segment represents one worker's set of produced activities.
Six workers are in this market.

We will proceed through this section of the paper in several steps. 
We will begin by stating the individual’s objective and showing neces¬ 
sary conditions on the set of produced activities. Next, the worker’s 
problem will be collapsed into an extensive-intensive decision. We will 
derive the equations describing the Nash solution within a local mar¬ 
ket. Comparative statics results will be examined. A discussion of an 
across-market equilibrium will conclude the section. 

Let us add the following demand assumptions: 

\[ p_Q + Q\,p_{QQ} < 0 \tag{12} \]

and an individual’s own marginal revenue product within an activity
($2(Q/m)^{1/2}[p(Q) + (Q/m)\,p_Q(Q)]$, where m is the total number of
producers of that activity) is weakly concave in aggregate output (Q).
Assumption (12) says that marginal revenue falls faster than the demand
curve. It implies assumption (4). 10 Weak concavity of own marginal
revenue product will ensure a unique Q. 11
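The reading of assumption (12) as "marginal revenue falls faster than the demand curve" follows from differentiating market marginal revenue. The lines below are a short worked check; the linear example at the end, with hypothetical constants $a$ and $c > 0$, is an assumption added only for illustration.

```latex
\begin{align*}
MR(Q) &= p(Q) + Q\,p_Q(Q), \\
\frac{dMR}{dQ} &= 2\,p_Q + Q\,p_{QQ}, \\
\frac{dMR}{dQ} - p_Q &= p_Q + Q\,p_{QQ} < 0 \quad \text{by (12)}.
\end{align*}
% Illustrative case (assumed): p(Q) = a - cQ with c > 0 gives p_Q = -c, p_{QQ} = 0,
% so the slope of MR is -2c, which lies below the demand slope of -c.
```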


10 Plug (12) into the expression for hq in n. 4. 

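11 Weak concavity of market demand is sufficient but not necessary for concavity of
own marginal revenue product (MRP) in Q. Note that

\[
\frac{\partial^2\,\text{own MRP}}{\partial Q^2}
= MP\,\frac{\partial^2\,\text{own MR}}{\partial Q^2}
+ 2\,\frac{\partial\,\text{own MR}}{\partial Q}\,\frac{\partial MP}{\partial Q}
+ \text{own MR}\,\frac{\partial^2 MP}{\partial Q^2}.
\]

The second term in the sum is negative (use (12)) as well as the final term. The first
term is nonpositive if market demand is weakly concave.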
The worker-producer’s objective is

\[ \max_{A,\,x(s)} \int_A p(Q(s), b)\,q(s)\,ds \tag{13} \]

subject to (2) and (3). Total output of activity s, Q(s), is

\[ Q(s) = q(s) + \sum_{j=1}^{m-1} q_j(s). \tag{14} \]

The $q_j$ refer to output by other workers. There are m total producers
of s (including self).

A necessary condition is that own marginal revenue product is 
equal across all produced activities and equals the shadow price of 
productive input: 

\[ 2x(s)[p(Q(s), b) + q(s)\,p_Q(Q(s), b)] - \lambda = 0 \quad \text{for all } s \in A. \tag{15} \]

Condition (15) is derived using the Cournot intensive conjecture
$dQ/dq = 1$. Because condition (15) holds for every worker producing a
particular activity, we have the symmetric solution $Q(s) = mq(s)$ on
each s. From the symmetry of demand and production functions
across activities, it follows that, within a given local market with
demand shifters b, individual (and total) outputs are the same in all
activities that have the same number of producers. That is,

\[ x(s) = x_m, \quad q(s) = q_m, \quad Q(s) = Q_m = mq_m = mx_m^2. \tag{16} \]

Q, q, and x vary across activities only if m varies across them.

Having established that, in a symmetric solution within a given 
locale, intensive levels of an activity depend on its number of produc¬ 
ers, we can rewrite a worker’s objective in extensive-intensive terms as 

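\[ \max_{\{\theta_m,\,\delta_m\}} \sum_m \delta_m\,p(Q_m)\,q_m = \sum_m p(Q_m)\,\frac{(\theta_m E)^2}{\delta_m} \tag{17} \]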

where $\theta_m$ is the fraction of the worker’s endowment devoted to a
subset of length $\delta_m$. The b argument has been suppressed in the
demand function to simplify notation. The right-hand side of (17) is
obtained through substitutions analogous to those used in (7).
Objective (17) is written as if the worker is free to choose among activities
with different m’s. His choice of $\delta_m$ is an extensive decision that, along
with choice of $\theta_m$, implies an intensive decision. In Nash equilibrium
among workers within a given local market, under the conjectures 
assumed earlier, the relevant objective for an individual is much sim¬ 
pler than (17). 
The individual objective simplifies to

\[ \max_{\delta_{n+1},\,\theta_n \in [0,1]} y = \delta_n\,p(Q_n, b)\,q_n + \delta_{n+1}\,p(Q_{n+1}, b)\,q_{n+1}, \tag{18} \]

where n is a nonnegative integer,

\[ q_n = \left(\frac{\theta_n E}{\delta_n}\right)^2, \qquad q_{n+1} = \left[\frac{(1 - \theta_n)E}{\delta_{n+1}}\right]^2. \tag{19} \]


This says that at the symmetric Nash solution the worker decides to
allocate input over a bindingly constrained length of activities with n
producers ($\delta_n$; note that n includes oneself) and a nonbindingly
constrained region of n + 1 producers in order to maximize own income.
The choice of $\delta_{n+1}$ is an extensive decision on the n + 1 producer
region, while choice of $\theta_n$ implies a quantity decision on both type n
and n + 1 activities. The simplification to (18) occurs because a whole
range of possibilities allowed by (17) are not consistent with individual
optimization (and, thus, not consistent with Nash equilibrium).

Section 3 of the unpublished appendix shows the reduction to (18). 
One important element in the proof shows that an individual always 
prefers putting all his resources into activities with the smallest num¬ 
ber of producers if the number of such activities is not bindingly 
constrained. If $\delta_{n+1}$ is not bindingly constrained, choice of $\delta_{n+1+k}$
with $k \ge 1$ is ruled out. Thus, if one is to get production in a region of
n + 1 workers, there must be a binding constraint on the available
number of n worker activities ($\delta_n$). The reader should also observe
that the symmetric Nash solution cannot have workers choosing
among bindingly constrained regions of both n and n - k workers. If
there were sets of activities produced by n - k workers, then $\delta_{n-k+1}$
would not be bindingly constrained since a worker could move re¬ 
sources into the n - k region and (counting his added presence) turn 
the region into a type n - k + 1 region. The maximization problem 
in Nash equilibrium reduces to choice over a bindingly constrained 
length and a nonbindingly constrained length, with the binding set 
having one fewer producer than the alternative set. Figure 2 dia¬ 
grams possible configurations. 


A. Results within a Local Market 

The symmetric Nash solution satisfies the following three equations: 

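\[ -p(Q_{n+1}, b) - 2q_{n+1}\,p_Q(Q_{n+1}, b) = 0, \tag{20} \]

\[ \left(\frac{\theta}{\delta_n}\right)[p(Q_n, b) + q_n\,p_Q(Q_n, b)] - \left(\frac{1 - \theta}{\delta_{n+1}}\right)[p(Q_{n+1}, b) + q_{n+1}\,p_Q(Q_{n+1}, b)] = 0, \tag{21} \]

\[ 1 - N\left(\frac{\delta_n}{n} + \frac{\delta_{n+1}}{n+1}\right) = 0. \tag{22} \]

Fig. 2.—a, N = 3, n = 1. Workers face a binding constraint on $\delta_1$ and a nonbinding
constraint on $\delta_2$. (Small letters denote different workers.) b, N = 3, n = 2. Workers face
a binding constraint on $\delta_2$ and a nonbinding constraint on $\delta_3$.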

Conditions (20) and (21) come from the individual first-order conditions
for (18). 12 The n subscript on $\theta$ has been dropped to save notation.
Condition (20) is the multiworker analogue to (8), while (21) is
equivalent to saying that own marginal revenue product in an n-producer
activity equals that in an n + 1-producer activity. Condition (22)
requires that, given N local producers, the values of $\delta_n$ and
$\delta_{n+1}$ stay within the total length of the activities segment (1).

12 Both are derived using the Cournot conjecture $\partial Q/\partial q = 1$ and taking $\delta_n$ as given to
the individual.
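The first-order conditions can be traced step by step. The sketch below assumes that the objective is income $y = \delta_n p(Q_n) q_n + \delta_{n+1} p(Q_{n+1}) q_{n+1}$ with $q_n = (\theta E/\delta_n)^2$ and $q_{n+1} = [(1-\theta)E/\delta_{n+1}]^2$, that $\delta_n$ is taken as given, and that $\partial Q/\partial q = 1$. The demand shifter $b$ is suppressed. It is a reconstruction offered as a reading aid, not the appendix derivation.

```latex
\begin{align*}
\frac{\partial q_{n+1}}{\partial \delta_{n+1}} &= -\frac{2q_{n+1}}{\delta_{n+1}}, \\
\frac{\partial y}{\partial \delta_{n+1}}
  &= p(Q_{n+1})\,q_{n+1}
   + \delta_{n+1}\bigl[p(Q_{n+1}) + q_{n+1}\,p_Q(Q_{n+1})\bigr]
     \frac{\partial q_{n+1}}{\partial \delta_{n+1}} \\
  &= q_{n+1}\bigl[-p(Q_{n+1}) - 2q_{n+1}\,p_Q(Q_{n+1})\bigr] = 0, \\[4pt]
\frac{\partial y}{\partial \theta}
  &= 2E^2\left\{\frac{\theta}{\delta_n}\bigl[p(Q_n) + q_n\,p_Q(Q_n)\bigr]
   - \frac{1-\theta}{\delta_{n+1}}\bigl[p(Q_{n+1}) + q_{n+1}\,p_Q(Q_{n+1})\bigr]\right\} = 0.
\end{align*}
% Dividing the first line by q_{n+1} > 0 and the second by 2E^2 > 0 gives the
% two individual conditions; the third condition is the adding-up constraint.
```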

Given b, E, and N, the system determines $\delta_{n+1}$, $\theta$, and $\delta_n$. The
values of $q_n$, $Q_n$, $q_{n+1}$, and $Q_{n+1}$ follow immediately using the auxiliary
results in (19) and (16); n is the smallest nonnegative integer that implies
a value of $\theta \ge 0$. 13 An interior solution satisfies the three equilibrium
conditions with $\theta \in (0, 1)$. If parameters are such that the solution
value of $\theta$ is greater than or equal to one, we are at a “corner”
equilibrium with all activities produced by n workers ($\theta = 1$ and $\delta_{n+1} = 0$).

One property of the within-locale Nash solution is that $Q_{n+1} > Q_n$
while $q_{n+1} < q_n$. Total output per activity is larger in those activities
produced by more workers, but output per worker is larger in those 
activities produced by fewer workers. These results and comparative 
statics are demonstrated in sections 4, 5, and 6 of the unpublished 
appendix. Only the more important results will be discussed here. 

We are interested in the effects on an individual’s degree of
generalization ($\delta_T$), where

\[ \delta_T \equiv \delta_n + \delta_{n+1}. \tag{23} \]

Comparative statics show that an increase in a population of homogeneous
consumers results in increased specialization ($\delta_T$ falls when local
population rises). The effect of population on specialization is
driven by the same forces seen in the monopoly case. Under the
Cournot conjecture within each activity, individual producers face
downward-sloped own demand and own marginal revenue. An increased
population lessens the restrictiveness of falling own marginal
revenue and encourages further specialization. Intensive output $q_{n+1}$
(and $Q_{n+1}$) rises in local population, and the number of type n activities
rises while the number of type n + 1 activities falls.

An increase in $E$, the productive endowment per worker, decreases
specialization ($\delta_T$ rises). Greater worker endowments do not affect
either the demand or production functions and therefore do not
affect the optimal intensive outputs of individuals. 14 The endowments
go to individual increases on the extensive margin.

13 This follows from the earlier result (see sec. 3 of the unpublished appendix) that
individuals always prefer production on a nonbindingly constrained region of as few
producers as possible. Of course, if, for a given n, eqq. (20)-(22) imply $\theta < 0$, this means
that no one would put resources into such activities and indeed none would exist. The
value of n must be raised until $\theta \ge 0$.

14 That is, the values of $q_n$, $Q_n$, $q_{n+1}$, and $Q_{n+1}$ do not change. However, the number
of type n + 1 activities rises, and the number of type n activities falls. Thus there is a
sense in which intensive changes occur.

Specialization falls ($\delta_T$ rises) as N rises on interior solutions. The
motivation behind this result comes through the Cournot conjecture.
Under Cournot, individuals face downward-sloped own demand and
own marginal revenue curves, which reflect the market demand curve 
minus the output of other workers. When a market’s N increases, this 
brings about more overlap on the limited set of activities. The individ¬ 
ual sees own demand and own marginal revenue shift inward and 
responds by generalizing over more activities. Market demand- 
constant increases in N cause workers to respond as if they were put 
into a less populated locale. 

The results above are for interior solutions only. At corner solutions,
$q_n = (E/\delta_n)^2$, and $\delta_T = \delta_n$ is determined by (22). Changes in b or
$E$ have no effect on $\delta_T$, and an increase in N lowers $\delta_T$.

The global effect of increases in N on specialization is nonmonotone.
On interiors, specialization falls in N. When a corner is
reached, n jumps by one, and specialization (along with intensive
output $Q_n$) rises in N. As N continues to rise, we move back to an
interior, with specialization falling in N.

Noncooperation gives comparative statics results markedly differ¬ 
ent from those under cooperation. As we have seen, the noncoopera¬ 
tive displays changes in specialization when local demand variables 
change. The cooperative will not display a change in specialization 
unless the number of producers changes. Under cooperation, spe¬ 
cialization rises in N. On interiors, the noncooperative has specializa¬ 
tion falling in N. 

The noncooperative displays an overlap of workers that is ineffi¬ 
cient from a production standpoint. On the other hand, total output 
per activity can often be higher under noncooperation. The coopera¬ 
tive will leave inputs unused rather than let market marginal revenue 
go negative. Total output under noncooperation can occur at levels at 
which market marginal revenue is less than zero. Thus consumers’ 
surplus can be greater under noncooperation. 

B. Results in an Across-Market Equilibrium 

In an across-market equilibrium, N varies across markets so that in¬ 
come per producer is equalized across all markets that have a pro¬ 
ducer: 

\[ Y = \delta_n\,p(Q_n, b)\,q_n + \delta_{n+1}\,p(Q_{n+1}, b)\,q_{n+1}. \tag{24} \]

Of course, the number of producers will be positively correlated with 
local population. The equilibrium correlation between specialization 
and population is not as clear-cut as in the cooperative case. With 
noncooperation, there are two offsetting effects on individual spe¬ 
cialization as we look across markets of increasing population. When 
changes in N are ignored, more population implies increased specialization.
Countering this, along interiors we saw that increasing N (pop¬
ulation constant) decreases specialization. In an across-market equi¬ 
librium, the net effect on specialization depends on which of the 
forces dominates. 

Table 1 contains a simulation of an across-market equilibrium, 
where each row represents a local market with a different population; 
N varies such that Y is equalized across markets. In the simulation we 
see that $\delta_T$ falls (specialization rises) as we go to more populated
markets. 15 The degree of overlap increases with population, and 
prices fall because of the increased competition associated with more 
overlap along with the economies from increased specialization. 


VI. Conclusion 

Increasing returns has long been recognized as an important phenomenon
in economic life. 16 The difficulty with increasing returns is
its seeming incompatibility with interior solutions. In the division of 
labor model presented here, we offset an increasing returns produc¬ 
tion technology with sufficiently falling demand to generate an inte¬ 
rior solution for the number of activities produced by an individual 
worker-producer. The model shows that an individual’s degree of 
specialization is sensitive not only to local demand shifters but also to 
whether or not workers organize in a cooperative manner. 

This paper contributes a new element to the explanation of the
Gibson paradox, the puzzling correlation between interest rates and 
the price level seen during the gold standard period. A shock that 
raises the underlying real rate of return in the economy reduces the 
equilibrium relative price of gold and, with the nominal price of gold 
pegged by the authorities, must raise the price level. The mechanism 
involves the allocation of gold between monetary and nonmonetary 
uses. Our explanation helps to resolve some important anomalies in 
previous work and is supported by empirical evidence along a num¬ 
ber of dimensions. 


Monetary theory leads us to expect a correlation between nominal 
interest rates and the rate of change, rather than the level, of prices. 
Yet, as emphasized by Keynes (1930), two centuries of data do not 
confirm this expectation. Between 1730 and 1930, the British consol 
yield exhibited close comovement with the wholesale price in¬ 
dex, alongside an essentially zero correlation with the inflation rate. 
Keynes referred to the strong positive correlation between nominal 
interest rates and the price level, which he called Gibson's paradox, as
“one of the most completely established empirical facts in the whole 
field of quantitative economics” (Keynes 1930, 2:198). Fisher wrote 
that “no problem in economics has been more hotly debated" (Fisher 
1930, p. 399). 

Fisher attempted to resolve the Gibson paradox by combining his 
relation between nominal rates and expected inflation with the hy¬ 
pothesis that inflationary expectations were formed as a long distrib¬ 
uted lag on past inflation. His explanation has been widely chal¬ 
lenged. Sargent (1973) noted that such a distributed lag appears 
incompatible with the stochastic process actually followed by inflation 
in the Gibson paradox period. Shiller and Siegel (1977) reported that 
movements in nominal rates during that period seem to be attribut¬ 
able to variation in real rates rather than the inflation premium. 

Keynes (1930) and Wicksell (1936) argued that shifts in the 
profitability of capital would be accompanied by accommodative 
movements in the stock of inside and outside money through the 
behavior of private and central banks. The Keynes-Wicksell approach 
founders on the observation that “neither changes in banks’ reserve 
ratios nor in the ratio of the domestic gold stock to high-powered 
money account for any sizable part of the long-run movements in the 
U.S. money stock before 1914” (Cagan 1965, p. 255). Instead, the 
dominant proximate determinant of the movement of prices and 
money during the period was variation in the stock of monetary gold. 
As Friedman and Schwartz (1976, p. 288) conclude, “the Gibsonian 
Paradox remains an empirical phenomenon without a theoretical ex¬ 
planation.” 

This paper contributes a new element to the explanation of the 
Gibson paradox. Noting the coincidence of the observation of the 
Gibson paradox and the gold standard period, we point out that 
the Gibson correlation may arise as a natural concomitant of a mone¬ 
tary standard based on a durable commodity. 1 Our theoretical expla¬ 
nation revolves around the essential nature of a metallic standard. 
Since the authorities peg the nominal price of gold at a constant, the 
general price level is the reciprocal of the price of gold in terms of 
goods. Determination of the general price level then amounts to the 
microeconomic problem of determining the relative price of gold. 
Since gold is a durable asset, its price is sensitive to the long-term 
interest rate. 
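The link from the interest rate to the price level can be written out in two lines. The sketch below is illustrative only: the symbols $\bar{P}_G$ (pegged nominal gold price), $p_g$ (price of gold in terms of goods), and the constant real service flow $s$ valued as a perpetuity at real rate $r$ are assumptions introduced here, and they are simpler than the model developed in the paper.

```latex
\begin{align*}
\bar{P}_G &= p_g \, P
  \quad\Longrightarrow\quad P = \frac{\bar{P}_G}{p_g}, \\
p_g &= \sum_{t=1}^{\infty} \frac{s}{(1+r)^t} = \frac{s}{r}
  \quad\Longrightarrow\quad P = \frac{\bar{P}_G \, r}{s}.
\end{align*}
% Under these assumptions a higher real rate r lowers the relative price of
% gold and, with the nominal gold price pegged, raises the price level P.
```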

Following the treatment of the gold standard by Friedman (1953), 


1 An independent and contemporaneous contribution that also stresses the mechanism
of the gold standard is Lee and Petruzzi (1986). We point out the essential
differences between that paper and the present one below. 
we focus on the demand for gold in its real, as well as its monetary,
uses. Using a model similar to that of Barro (1979), we show that if 
innovations in the productivity of capital are an important exogenous 
disturbance, as in Wicksell and Keynes, the negative equilibrium rela¬ 
tionship between the relative price of gold and the real interest rate 
can give rise to Gibson's paradox. Our mechanism, which relies on the 
allocation of gold between monetary and nonmonetary uses, is (unlike 
the Keynes-Wicksell mechanism) consistent with the stylized fact that 
prices varied closely with the monetary gold stock during the gold 
standard period. 

The paper is organized as follows. Section I refutes recent claims by 
Benjamin and Kochin (1984) that much of the Gibson correlation is 
spurious and demonstrates the temporal coincidence of the Gibson 
correlation and the gold standard. We also present evidence from 
equity yields indicating that the Gibson correlation held for real as 
well as nominal assets. Section II presents a theory of the price level 
under a gold standard and shows how the Gibson correlation arises 
naturally in such a setting. Section III shows that the inverse relation¬ 
ship between the relative price of gold and the real interest rate, 
which provides the basis for our resolution of Gibson’s paradox, is a 
dominant feature of actual gold price fluctuations in the post-1970 
period, when the nominal price of gold has floated freely. Section IV 
provides some limited evidence that two key ingredients of our the¬ 
ory, productivity shocks and substitution between monetary and non¬ 
monetary gold, were in fact important features of the gold standard 
period. Section V contains a brief summary and conclusions. 


I. Gibson’s Paradox in World Data, 1730-1938 

This section examines world data on commodity prices, long-term 
interest rates, and stock yields in an effort to characterize Gibson’s 
paradox. We address the arguments of Benjamin and Kochin (1984) 
concerning the spurious regression problem and go on to show that 
Gibson’s paradox was primarily a gold standard phenomenon. Then, 
using stock yield data, we argue that Gibson’s paradox involved the 
underlying real rate of return, and not merely the nominal yield on 
nominal assets. 

We work with both British wholesale prices 2 and a world price 
index, which is a GNP-weighted average of the wholesale prices of 

2 The British price data are from Mitchell and Deane (1962) and were assembled by
linking the Elizabeth Schumpeter index with the annual average of the Gayer, Rostow, 
and Schwartz monthly index of British commodity prices and then (beginning in 1846) 
the Sauerbeck-Statist overall price index. 
four countries. 3 In fact, the correlation of the British price series and
our world price level is .96, and very similar results are obtained using 
either index. We take the yield on British consols to be the world long¬ 
term interest rate. 4 


Was There a Gibson Paradox?

Data on world prices and interest rates for the period 1821-1913 are 
plotted in figure 1. We use the consol yield because, as discussed 
below, our theoretical model points most clearly toward a relationship 
between long-term interest rates and gold prices. In table 2 below, we 
also present results using the short-term interest rate. Although a 
clear positive relationship is observable, Benjamin and Kochin (1984) 
note that the two series are very nearly random walks, and thus the 
risk of spurious correlation is high. Granger and Newbold (1974, 
1977) show that an ordinary t-test is very likely to show a “significant”
relationship between two random walks, even if they were generated 
independently. These authors also show that standard procedures for 
correcting serial correlation are inadequate when the error process 
involves a unit root. 
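The spurious regression problem described above is easy to reproduce by simulation. The following is a minimal sketch, assuming Gaussian random walks, 50 observations, and a nominal critical value of 2 for the t-statistic; the function names and parameter choices are illustrative and are not taken from Granger and Newbold.

```python
import numpy as np


def ols_t_stat(y, x):
    """t-statistic on the slope from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)      # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)      # conventional OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])


def rejection_rates(n_obs=50, n_sims=1000, crit=2.0, seed=0):
    """Share of simulations with |t| > crit, in levels and in first differences.

    The two series are independent random walks, so every rejection is spurious.
    """
    rng = np.random.default_rng(seed)
    level_hits = 0
    diff_hits = 0
    for _ in range(n_sims):
        x = np.cumsum(rng.standard_normal(n_obs))
        y = np.cumsum(rng.standard_normal(n_obs))
        level_hits += abs(ols_t_stat(y, x)) > crit
        diff_hits += abs(ols_t_stat(np.diff(y), np.diff(x))) > crit
    return level_hits / n_sims, diff_hits / n_sims


if __name__ == "__main__":
    levels, diffs = rejection_rates()
    print(f"levels: {levels:.2f}  first differences: {diffs:.2f}")
```

In levels the nominal test rejects far more often than its stated size, while in first differences the rejection rate stays close to the nominal level, which is why differencing serves as a diagnostic.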

We deal with the spurious regression problems in two ways. First, 
we run the regression in first differences, a standard diagnostic proce¬ 
dure recommended by Granger and Newbold (1977) and Plosser and 
Schwert (1978). Because first-differencing accentuates the high fre¬ 
quency variation in the data at the expense of the low frequencies 
(Anderson 1971) and because it may exacerbate the problem of er- 
rors-in-variables (Plosser and Schwert 1978), we also report regres¬ 
sions in the levels of prices and interest rates. The simulation studies 
of Granger and Newbold (1974, 1977) provide some rough guidance 
as to the correct critical levels for rejection of the null hypothesis that 
two random walks are independent. They suggest that an ordinary
t-statistic greater than 10 or so (corresponding, with 50 observations, to
an $R^2$ of about .7) would properly lead to rejection at the 5-10 percent
level. We take this as our criterion for significance of the regressions
in levels.

3 We construct the world price index because our model, like that of Barro (1979),
concerns the world price level under a gold standard. The four countries are Britain,
France, Germany, and the United States, although we exclude the U.S. data during the
Civil War period. The weights are from Bairoch (1982), who attempts to proxy
manufacturing output of a number of countries in the years 1860 and 1913. The prices for
France and Germany are from Mitchell (1975), while those for the United States are
from Warren and Pearson (1933).

4 Following the suggestion of Homer (1977) and Shiller and Siegel (1977), we use the
yields on 2.5 percent government annuities for the years 1881-88 instead of consol
yields. During this period, yields had fallen below the 3 percent rate at which consols
were issued, and the possibility of government redemption (which actually occurred in
the "refunding of 1888”) kept the yields on consols from falling much further.

Fig. 1.—The world price level and the consol yield

Table 1 presents Gibson regressions, in both levels and first differ¬ 
ences, for various subperiods of 1730-1938 (table 2 reports analo¬ 
gous regressions using the British open market rate of discount, a 
short-term rate given by Homer [1977]). The results justify several 
important conclusions. First, Gibson’s paradox is not an example of 
the spurious regression phenomenon, at least not during the classical 
gold standard years 1821-1913. The regressions in table 1 using the 
consol yield in differences are significant at the 1 percent level for 
the period as a whole and at least at the 5 percent level for both 
of the subperiods 1821-71 and 1872-1913. The regressions in levels 
for the whole period have t-statistics in excess of 10. The results in
table 2 using the short rate are somewhat weaker but consistent. 

Second, Gibson’s paradox is by no means a wartime phenomenon. 




TABLE 1

Regression of Logarithm of Price Level on Consol Rate
(Levels and First Differences)

Sample        Price      Levels/First         Coefficients of    Durbin-Watson
Period        Series     Differences          Consol Yield       Statistic        R²

1730-96       British    Levels               .15 (.02)          .44              .43
                         First differences    -.02 (.02)         1.88             .01
1797-1820     British    Levels               -.08 (.05)         .72              .11
                         First differences    -.05 (.04)         1.68             .06
1821-1913     World      Levels               .40 (.03)          .47              .71
                         First differences    .15 (.04)          1.73             .1)
              British    Levels               .38 (.03)          .44              .65
                         First differences    .16 (.05)          1.71             .10
1821-71       World      Levels               .17 (.05)          .61              .18
                         First differences    .14 (.06)          1.72             .10
              British    Levels               .16 (.06)          .52              .11
                         First differences    .14 (.06)          1.77             .08
1872-1913     World      Levels               .43 (.04)          .32              .71
                         First differences    .21 (.08)          1.79             .11
              British    Levels               .41 (.05)          .28              .67
                         First differences    .24 (.09)          1.50             .14
1872-1938*    British    Levels               .36 (.02)          .40              .78
                         First differences    .24 (.05)          1.57             .24
1921-38*      British    Levels               .31 (.06)          .62              .58
                         First differences    .16 (.09)          1.97             .09

* The world price series covers only 1821-1913.
TABLE 2

Regression of Logarithm of Price Level on British Open Market Rate of
Discount (Levels and First Differences)

Sample Period       Levels/First         Coefficients of    Durbin-Watson
and Price Series    Differences          Consol Yield       Statistic        R²

1826-1913:
  World             Levels               .07 (.01)          .35              .27
                    First differences    .017 (.004)        1.68             .15
  British           Levels               .08 (.01)          .36              .35
                    First differences    .026 (.004)        1.82             .33
1826-71:
  World             Levels               .05 (.01)          .59              .22
                    First differences    .02 (.01)          1.67             .20
  British           Levels               .04 (.01)          .50              .29
                    First differences    .02 (.01)          1.89             .31
1872-1913:
  World             Levels               .07 (.03)          .20              .14
                    First differences    .012 (.009)        1.63             .02
  British           Levels               .09 (.02)          .19              .26
                    First differences    .04 (.01)          1.76             .39
1872-1938:
  British           Levels               .14 (.02)          .25              .33
                    First differences    .05 (.01)          1.50             .22
1921-38:
  British           Levels               .10 (.03)          .48              .51
                    First differences    .05 (.03)          1.28             .10

Source.—Homer (1977) for interest rate data; price data are described in the text.


Not only is the relation significant and stable during the peacetime, 
gold standard years from the 1821 resumption of the gold standard 
in Britain to the eve of World War I, but it completely breaks down 
during the Napoleonic war period of 1797-1820, when the gold stan¬ 
dard was abandoned. Over this period, the correlation was negative in 
both levels and first differences. These findings are evidence against 
theories based on government purchases (Benjamin and Kochin
1984) or finance (Keynes 1930; Shiller and Siegel 1977) during
wartime. 5

Finally, the evidence of the Gibson correlation is weaker for the 
pre-Napoleonic period 1730-96 and the interwar years 1921-38 
than for the classical gold standard period. It is plausible to relate the 
weak, but not nonexistent, evidence of the Gibson paradox from 
these periods to the rather restricted functioning of the gold standard 
during them. Although Britain was effectively on a gold standard 
between 1730 and 1796, most of the rest of the world was not. Like¬ 
wise, the post-1921 gold standard was closely “managed” by central 
banks and encumbered by formal and informal restrictions limiting 
convertibility. 


Is There Still a Gibson Paradox? 

An important question, and a frequent source of confusion, is 
whether or not Gibson’s paradox persists into the post-World War II
period. Some authors have concluded that it does, on the basis of raw 
correlations in levels. This is inappropriate for a period during which 
the price level rose in every year. To establish an economically mean¬ 
ingful Gibson paradox, one would need to show that when the rate of 
inflation slowed (remaining positive, however), the interest rate con¬ 
tinued to rise with the price level. That this was not the case is clearly 
seen in figure 2. 6 As becomes especially clear after 1965, the interest 
rate follows the rate of inflation rather than the price level. The 
complete disappearance of Gibson's paradox by the early 1970s coin¬ 
cides with the final break with gold at that time. 

The results of this and the preceding subsection corroborate the 
view of Friedman and Schwartz (1982) that Gibson’s paradox is 
largely, or perhaps solely, a gold standard phenomenon: “For the 
period our data cover [1880-1976] it [Gibson’s paradox] holds clearly
and unambiguously for the United States and the United Kingdom 
only for the period from 1880 to 1914, and less clearly for the inter¬ 
war period” (p. 586). 


5 Robert Barro, in private communication, informed us that a positive relationship 
between temporary government purchases and real interest rates holds throughout the
nineteenth century, but a relationship between government purchases and the price 
level holds only for periods of suspension. 

6 Figure 2, which shows the 3-month Treasury bill rate alongside the level of the
Consumer Price Index (CPI) and a 6-month moving average of inflation, extends a 
similar chart presented in Friedman and Schwartz (1976). 
Fig. 2.—Gibson's paradox vs. the Fisher effect, 1955-84


Gibson’s Paradox and Real Rates 

In attempts to rationalize Gibson’s paradox, it is critical to determine 
whether it applies to real, or merely to nominal, rates of return. 
Sargent (1973), Shiller and Siegel (1977), Benjamin and Kochin 
(1984), and Barsky (1987) provide some indirect evidence that the 
paradox held for real rates by arguing that observed nominal rates 
entirely reflected ex ante real rates and did not embody inflation 
premia. They show that inflation was serially uncorrelated during the 
Gibson paradox period, and thus the univariate properties of infla¬ 
tion alone did not justify the inclusion of an inflation premium in 
nominal rates. 7 Shiller and Siegel (1977) further demonstrate that 
nominal consol yields had no predictive power for subsequent long¬ 
term inflation, as they should have if nominal yields were set in antici¬ 
pation of future inflation. 

7 Barsky (1987) considers the possibility that inflation was significantly more forecast¬ 
able with a larger information set. While it is impossible to give a fully satisfactory 
treatment of this issue with the limited data available, it remains true that there is little 
evidence of a rational Fisherian premium in nominal interest rates prior to 1914. 





Here we follow Sargent (1973) in presenting more direct evidence 
by studying equity yields. We examine both earnings/price and divi¬ 
dend/price ratios. Both should be proxies for long-term required real 
returns. The former have the virtue of reflecting any expected capital 
gains associated with retained earnings, while the latter are measured 
more accurately and are less likely to be distorted by transitory fluctu¬ 
ations in profits. An obvious alternative procedure would be the use 
of ex post equity returns. We did not use this procedure because prior 
analysis of the data indicated that almost all the variation in ex post 
returns reflects news rather than changes in ex ante returns. Fried¬ 
man and Schwartz (1982) do briefly examine holding period returns 
with inconclusive results. 

We rely on the composite dividend and earnings yields given in 
Cowles (1939). These data are available only after 1871. Table 3 re¬ 
ports regressions involving the yields, in levels and first differences, 
comparable to the regressions in table 1. The coefficients of the regressions
in levels are always positive, with conventional t-statistics
typically between 5 and 15. The dividend yield regressions for 1872-1913,
in particular, exhibit large conventional t-statistics. The estimated
coefficients do tend to be smaller than those from regressions
using the bond yield. The first-difference regression coefficients are 
significantly positive in half of the cases and insignificant in the re¬ 
mainder. Overall, the regressions support the view that Gibson's 
paradox involved the real rate. 
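
The kind of regression reported in table 3 can be illustrated with a minimal sketch. It regresses a log price level on a yield, in levels and in first differences, and reports the slope, its conventional standard error, the Durbin-Watson statistic, and R². The data are synthetic and purely illustrative (they are not the Cowles yields or the historical price series), and the function name `ols_with_dw` and all parameter values are assumptions made for this example.

```python
import numpy as np

def ols_with_dw(y, x):
    """Regress y on a constant and x.

    Returns (slope, conventional standard error, Durbin-Watson, R^2).
    """
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)             # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)        # conventional OLS covariance
    se = np.sqrt(cov[1, 1])
    dw = np.sum(np.diff(resid) ** 2) / (resid @ resid)
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta[1], se, dw, r2

# Synthetic data (assumption): 42 annual observations, as in 1872-1913.
rng = np.random.default_rng(0)
n = 42
stock_yield = 5.0 + np.cumsum(rng.normal(0.0, 0.2, n))        # persistent yield, percent
log_price = 4.0 + 0.10 * stock_yield + rng.normal(0.0, 0.02, n)

levels = ols_with_dw(log_price, stock_yield)
diffs = ols_with_dw(np.diff(log_price), np.diff(stock_yield))
print("levels:            slope=%.3f se=%.3f DW=%.2f R2=%.2f" % levels)
print("first differences: slope=%.3f se=%.3f DW=%.2f R2=%.2f" % diffs)
```

A low Durbin-Watson statistic in a levels regression signals serially correlated residuals, which is why the first-difference results serve as a check against a spurious correlation between two trending series.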

It might be argued that fluctuations in dividend/price and earnings/ 
price ratios reflect movements in the level of future dividends and 
earnings relative to current values. As a check against this possibility, 
we followed the procedures used in Blanchard and Summers (1984) 
and calculated the required rate of return necessary to justify the 
market’s value given autoregressive projections of future dividends. 
The results of correlating this internal rate of return series with the 
price level are also displayed in table 3. The results parallel those 
obtained with the cruder dividend and earning yield proxies for real 
returns. 


Summary 

We conclude this section with a summary of the empirical findings 
about Gibson’s paradox that theory should seek to explain. (1) There 
is a Gibson paradox that is more than a spurious correlation between 
two random walks. (2) Far from being primarily a wartime phenomenon,
Gibson’s paradox characterizes the gold standard years 1821-
1913, and those years represent the only long period over which the 



TABLE 3

Regression of Logarithm of Price Level on Cowles Commission Stock Yields
(1872-1913)

Price       Levels/First         Coefficient of    Durbin-Watson
Series      Differences          Stock Yield       Statistic        R²

Dividend/Price Ratio Yield

World       Levels                .14 (.01)         1.17             .74
            First differences     .03 (.01)         1.84             .12
British     Levels                .13 (.02)          .70             .61
            First differences     .03 (.01)         1.43             .11
U.S.        Levels                .12 (.02)          .82             .44
            First differences    -.02 (.02)         1.61             .01

Earning/Price Ratio Yield

World       Levels                .07 (.01)          .61             .38
            First differences    -.01 (.01)         1.26             .01
British     Levels                .08 (.01)          .59             .52
            First differences     .02 (.01)         1.34             .09
U.S.        Levels                .08 (.01)          .79             .45
            First differences     .00 (.01)         2.14            -.02

Internal Rate of Return Yield

World       Levels                .08 (.01)          .53             .67
            First differences     .02 (.01)         1.75             .0'
British     Levels                .07 (.01)          .40             .5'
            First differences     .02 (.01)         1.42             .0.
U.S.        Levels                .07 (.01)          .31             .4
            First differences     .02 (.01)         1.56             .0






correlation held continuously. Gibson’s paradox had clearly vanished 
by the 1970s. (3) The paradox appears to involve the real rate. Re¬ 
gressions using the Cowles stock yield data suggest that the price level 
was correlated with the expected return on capital. 

II. A Theory of the Real Price of Gold and the 
World Price Level 

This section develops a simple model of the determination of the real
price of gold, and hence the general price level, under a gold standard.
We then discuss the response of the model to a disturbance to
the real rate of return. Formally, the model describes a closed,
full-employment economy, which is best thought of as the world economy
under fixed exchange rates and fully flexible prices. The model is
very close to that of Barro (1979), except that it drops his partial
adjustment formulation and emphasizes explicitly the role of the real
interest rate.

For our purposes, a gold standard is defined as the maintenance of 
full convertibility between gold and dollars at a fixed ratio. The gold 
backing of the money stock need not be one for one. Money consists 
of bank deposits, and, for simplicity, there are no gold coins. The 
fixed nominal price of gold implies that determining the general price 
level is equivalent to determining the equilibrium relative price of 
gold. We set the nominal price of gold equal to unity for convenience. 
The real price of gold is then $P_g = 1/P$, where $P$ is the general price
level. 

Gold is a highly durable asset, and thus, as stressed by Levhari and 
Pindyck (1981), it is the demand for the existing stock, as opposed to 
the new flow, that must be modeled. The willingness to hold the stock 
of gold depends on the rate of return available on alternative assets. 
We assume that the alternative assets are physical capital and bonds, 
both earning a real return of r. The real rate of return is exogenous to 
the model but subject to shocks. These shocks reflect changes in the 
actual or perceived productivity of capital as envisioned by Keynes 
and Wicksell. Interpreted as shocks to the production function, these 
sorts of disturbances also play a crucial role in the equilibrium real 
business cycle models surveyed by Prescott (1986). 

The Model 

The gold stock $G$ is held in two forms: as bank reserves (denoted $G_m$)
and as nonmonetary gold (denoted $G_n$). Nonmonetary gold (best
thought of as jewelry, objects of art, etc.) is held partly for its (possibly
time-dependent) service flow or marginal “dividend,” which is denoted
$D(G_n, t)$ with $D_{G_n} < 0$. Consumers equate the marginal service
flow $D(G_n, t)$ to the user cost $rP_g - \dot{P}_g$ (we assume no depreciation), so
that at all times the real gold price must satisfy

$$\dot{P}_g = rP_g - D(G_n, t), \tag{1}$$

where we have suppressed the time subscripts for convenience.

The monetary side consists of a conventional demand for real balances,

$$\frac{M}{P} = L(i, t) = L\!\left(r - \frac{\dot{P}_g}{P_g},\, t\right), \qquad L_i < 0, \tag{2}$$

and a relation between monetary gold reserves and the money stock,

$$M = \mu G_m, \tag{3}$$

where $\mu$ is a fixed parameter. Finally,

$$G_m + G_n = G, \tag{4}$$

the total existing gold stock.

Equations (1)—(4) determine the real price of gold as a negative 
function of current and future real interest rates r, and gold supplies 
G, and determine the allocation of gold between monetary and non¬ 
monetary uses at each point in time. 

The price level may rise or fall over time depending on how the 
stock of gold, the dividend function, $D(G_n, t)$, and the demand for
money, L(i, t), evolve over time. Secular increases in the demand for 
monetary and nonmonetary gold caused by rising income levels tend 
to create an upward drift in the real price of gold, that is, secular 
deflation. Tending to offset this effect would be gold discoveries and 
technological innovations in mining such as the cyanide process. The 
fact that the average growth of prices between 1870 and 1913 was 
close to zero might have been something of a historical accident, with 
gold discoveries and mining innovations just offsetting, on average, 
the effects of growth in the nongold sectors (see Barro 1979; Rockoff 
1984). 

For ease of exposition, we focus only on steady states in which the 
demand for gold is stable and its quantity is fixed, 8 so that its real price 


8 For a more elaborate analysis in which gold mining is endogenous, see Barsky and 
Summers (1985). Endogeneity of gold mining does not play a central role in any 
version of this paper, nor does it appear in the work of Lee and Petruzzi (1986). The 
key difference between our theoretical development and theirs is our emphasis on the 
crucial distinction between monetary and nonmonetary gold and on the service flows
from nonmonetary gold. Both of these elements are necessary for a satisfactory theo¬ 
retical demonstration of our Gibson paradox result. 
Fig. 3.—The determination of the gold price


is constant. In this case (1) becomes

$$D(G_n) = rP_g, \qquad D'(G_n) < 0, \tag{1'}$$

and (2), (3), and (4) together yield

$$L(r) = \mu(G - G_n)P_g, \qquad L'(r) < 0. \tag{5}$$

Figure 3 shows graphically the determination of equilibrium. The
real price of gold (inverse of the general price level) appears on
the vertical axis, and the nonmonetary gold stock is measured along
the horizontal axis. The negatively sloped $G_nG_n$ locus, which represents
equation (1'), shows that as the real price of gold falls (for given
$r$), the desired stock held for nonmonetary purposes rises. The
upward-sloping $G_mG_m$ locus, which represents equation (5), shows that
(for a given total gold stock) an increase in nonmonetary holdings is
consistent with monetary equilibrium only at a higher $P_g$ (lower general
price level). This is because the increase in $G_n$ comes at the expense
of monetary holdings, creating an excess demand for real balances
at the initial price level.
Response to an Increase in the Real Interest Rate


Differentiating the system (1') and (5) with respect to the real rate of
return $r$ yields 9

$$\frac{dP_g}{dr} = \frac{-D'(G_n)\,\eta\mu G_m P_g r^{-1} - \mu P_g^2}{-D'(G_n)\,\mu G_m + \mu r P_g} < 0 \tag{6}$$

and

$$\frac{dG_m}{dr} = \frac{(1 + \eta)\,\mu P_g G_m}{-D'(G_n)\,\mu G_m + \mu r P_g}, \tag{7}$$


where $\eta < 0$ is the interest elasticity of real money demand. Equation
(6) shows that, as illustrated in figure 3, an increase in the real rate
unambiguously reduces the relative price of gold $P_g$. Since the general
price level is the inverse of $P_g$, shocks to the real rate lead the interest
rate and the general price level to covary positively, the Gibson
phenomenon. The economic mechanism is clear. Increases in real interest
rates raise the carrying cost of nonmonetary gold, reducing the
demand for it. They also reduce the demand for monetary gold as
long as money demand is interest elastic. The resulting reduction in
the real price of gold is equivalent to an increase in the general price
level.

Equation (7) demonstrates that a rise in r will cause the monetary 
gold stock to increase at the expense of the nonmonetary stock unless 
the interest elasticity of demand for real balances exceeds unity in 
absolute value. Available empirical estimates suggest that this elastic¬ 
ity is much smaller than unity (see Friedman and Schwartz 1982). 
Thus the rise in prices will be associated with an increase in the mone¬ 
tary gold stock, in accordance with Cagan’s stylized fact. The rise in 
velocity further reinforces the Gibson correlation, but (6) shows that 
Gibson's paradox continues to hold for $\eta = 0$.
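
The comparative statics in (6) and (7) can be checked numerically. The sketch below assumes constant-elasticity forms D(G_n) = A·G_n^(-theta) and L(r) = B·r^eta, which do not appear in the text, together with arbitrary parameter values; it solves the steady-state system (1') and (5) by bisection and compares central finite-difference derivatives with the analytic expressions. All function names and numbers are illustrative assumptions.

```python
def solve_steady_state(r, G=1.0, mu=2.0, A=0.05, theta=1.5, B=0.3, eta=-0.2):
    """Return (G_n, P_g) solving D(G_n) = r*P_g and L(r) = mu*(G - G_n)*P_g.

    Assumed forms (illustrative only): D(g) = A*g**(-theta), L(r) = B*r**eta.
    """
    money_demand = B * r ** eta

    def excess(g_n):
        # Real balances supplied at the P_g implied by (1'), minus demand.
        p_g = A * g_n ** (-theta) / r
        return mu * (G - g_n) * p_g - money_demand

    # excess() is positive near 0, negative near G, and decreasing in between.
    lo, hi = 1e-9, G - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    g_n = 0.5 * (lo + hi)
    return g_n, A * g_n ** (-theta) / r


def analytic_derivatives(r, G=1.0, mu=2.0, A=0.05, theta=1.5, B=0.3, eta=-0.2):
    """Evaluate equations (6) and (7) at the steady state."""
    g_n, p_g = solve_steady_state(r, G, mu, A, theta, B, eta)
    g_m = G - g_n
    d_prime = -theta * A * g_n ** (-theta - 1.0)       # D'(G_n) < 0
    denom = -d_prime * mu * g_m + mu * r * p_g
    dpg_dr = (-d_prime * eta * mu * g_m * p_g / r - mu * p_g ** 2) / denom
    dgm_dr = (1.0 + eta) * mu * p_g * g_m / denom
    return dpg_dr, dgm_dr


def numeric_derivatives(r, h=1e-5, **kw):
    """Central finite differences of P_g and G_m with respect to r."""
    G = kw.get("G", 1.0)
    gn_hi, pg_hi = solve_steady_state(r + h, **kw)
    gn_lo, pg_lo = solve_steady_state(r - h, **kw)
    return (pg_hi - pg_lo) / (2 * h), ((G - gn_hi) - (G - gn_lo)) / (2 * h)


r0 = 0.04
dpg_a, dgm_a = analytic_derivatives(r0)
dpg_n, dgm_n = numeric_derivatives(r0)
print(dpg_a, dpg_n)   # both negative: a higher r lowers the real price of gold
print(dgm_a, dgm_n)   # both positive when |eta| < 1
```

Setting `eta=0.0` in `analytic_derivatives` leaves the first derivative negative, which matches the statement that the result does not rely on interest-elastic money demand.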

The mechanism we have highlighted provides a possible explana¬ 
tion of Gibson’s paradox that, like the Keynes-Wicksell explanation, 
accounts for the observation that the correlation held for real rates of 
return but avoids the principal difficulty with earlier resolutions: their 
counterfactual dependence on changes in the money multiplier or the 
central bank's gold ratio. Our approach also accounts for the coinci¬ 
dence of the Gibson paradox observation and the gold standard. 

An important limitation of our theory is that the Gibson correlation 
arises only from shocks to the real rate. A gold discovery would, of 
course, take the form of an increase in $G_t$ for all $t$ greater than some $t_0$.
In our model, this leads to a permanent increase in the price level 
without a corresponding change in r. Shocks of this sort weaken the 


9 Note that this corresponds to examining the effect of an equal change in all forward
interest rates. Empirically, we think of a change in long-term real interest rates. 




Gibson correlation, but it is important to understand that gold discov¬ 
eries do not contribute correlations opposite in sign to that of Gibson. 
The strength of the observed Gibson correlation will depend on the 
relative importance of gold discoveries and shocks to real interest 
rates. 

III. Real Interest Rates and the Relative Price of 
Gold, 1975-84 

The model of the price level under a gold standard in Section II is
essentially a theory of the relative price of gold. Omitting the mone¬ 
tary demand for gold, we see that the theory continues to hold in the 
same fashion. Thus an important test of the model is to see how well it 
accounts for movements in the relative price of gold (and other met¬ 
als) outside the context of a gold standard. The properties of the 
inverse relative prices of metals today ought to be similar to the prop¬ 
erties of the general price level during the gold standard years. We 
focus on the period from 1973 to the present, after the gold market 
was sufficiently free from government pegging operations and from 
limitations on private trading for there to be a genuine “market” price 
of gold. 

In order to study long-term real rates in recent years, we require 
forecasts of inflation over a horizon appropriate to a long-term 
bond. 10 Box-Jenkins analysis suggests that the inflation rate in the 
1970s is well modeled as an IMA(1, 1) process, resulting from a mix¬ 
ture of permanent and transitory shocks (Muth 1960). This yields the 
same $k$-step-ahead forecast for all horizons $k \geq 1$ (see Sargent 1979, p.
265), which has an interpretation as “permanent expected inflation.” 
The forecasts are based on a “rolling autoregressive integrated mov¬ 
ing average (ARIMA)” procedure so that only information available 
as of the forecast date is used. 
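
The forecasting step can be illustrated with a small sketch. For an IMA(1, 1) process the optimal forecast at every horizon is an exponentially weighted moving average of past observations (Muth 1960). The code below re-estimates the moving average parameter on each expanding sample by a grid search over one-step-ahead squared errors. This is a simple stand-in for the rolling ARIMA procedure described above, not the estimation method actually used; the function names, the grid, and the simulated inflation series are all assumptions.

```python
import numpy as np

def one_step_sse(pi, theta):
    """Sum of squared one-step-ahead errors of the EWMA forecast."""
    forecast, sse = pi[0], 0.0
    for x in pi[1:]:
        sse += (x - forecast) ** 2
        forecast = (1.0 - theta) * x + theta * forecast
    return sse

def permanent_inflation(pi, theta):
    """EWMA forecast implied by an IMA(1,1) with MA parameter theta.

    The same value is the forecast for every horizon k >= 1.
    """
    forecast = pi[0]
    for x in pi[1:]:
        forecast = (1.0 - theta) * x + theta * forecast
    return forecast

def rolling_forecasts(pi, min_obs=20, grid=None):
    """Forecast at each date using only data available up to that date."""
    if grid is None:
        grid = np.linspace(0.0, 0.95, 20)
    forecasts = []
    for t in range(min_obs, len(pi) + 1):
        sample = pi[:t]
        theta = min(grid, key=lambda th: one_step_sse(sample, th))
        forecasts.append(permanent_inflation(sample, theta))
    return np.array(forecasts)

# Simulated IMA(1,1) inflation series (assumption, 120 observations).
rng = np.random.default_rng(1)
eps = rng.normal(0.0, 1.0, 121)
true_theta = 0.6
pi = 5.0 + np.cumsum(eps[1:] - true_theta * eps[:-1])

expected = rolling_forecasts(pi)
print(expected[:5])
```

Subtracting a forecast of this kind from a nominal long-term yield gives a proxy for the expected real interest rate.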

Figure 4 displays the (log) inverse real gold price and our estimate 
of the expected pretax real interest rate. The strong comovement
over the longer cycles is reminiscent of Gibson’s paradox. Variation in 
the real interest rate appears to be responsible for much of the year- 
to-year movement in the relative price of gold. After 1980, inflation 
exhibits increased volatility, and the ARIMA forecast is less satisfac¬ 
tory. Some of the variation in our proxy for the expected real yield on 
bonds ought to be regarded as spurious. Also, it is clear that, from 
1980 onward, the relative price of gold is higher for any given real 

10 Instead of relating the relative price of gold to the real interest rate, Lee and 
Petruzzi (1986) construct a “gold-denominated" interest rate using futures market data. 
Our analysis in Sec. II of this paper points to the real interest in terms of goods, which 
we use in our empirical work. 
Fig. 4.—The (pretax) real interest rate and the inverse relative price of gold


interest rate than it was during the 1970s. Real interest rates are not 
the only determinant of the relative price of gold. Yet the impression 
that real rates were high after 1981 and that these high rates were 
associated with a low relative price of gold vis-à-vis the 1980 level is
unmistakable. 11 

A regression of the log real gold price on our long-term real inter¬ 
est rate, allowing a separate constant term for the post-1980 period, 
yields 


11 One might wonder whether this conclusion would be overturned by considering a 
“world" real gold price. We constructed one, using the trade-weighted real exchange 
value of the dollar series supplied by the Federal Reserve. Through 1982, the results 
were almost identical. In the following 3 years, the large real appreciation of the dollar 
caused the dollar real price of gold to be considerably lower than the world real price. 
Note, however, that real interest rates were considerably higher in the United States 
than in the rest of the world. 
Fig. 5.—The (pretax) real interest rate and the inverse relative price of nonferrous
metals.


$$\log(\text{Real Gold Price}) = \underset{(.03)}{4.54} + \underset{(.06)}{.84}\,\text{Post-1980} - \underset{(.01)}{.06}\,\text{Real Rate},$$

$R^2 = .81$, Durbin-Watson = 1.13.

The data strongly reject constancy of the intercept term before and
after 1980. The slope estimate, however, is quite stable across
subperiods. Alternative specifications, not reported here, using an
after-tax real interest rate variable provided even stronger evidence of
interest sensitivity of gold prices.

To ensure that we are capturing the general tendency of increases
in real interest rates to depress the prices of durable assets, and not
some peculiarity of the gold market, we also examine the behavior of
other metals prices. Figure 5 shows our long-term real interest rate
variable along with the level of the producer price index of prices of
nonferrous metals relative to the CPI. The results are, if anything,
even more striking than those for gold, providing further support for
the asset-pricing approach to metals prices.

IV. Monetary and Nonmonetary Gold Stocks and 
Productivity Shocks 

Two important elements of our model are productivity shocks and 
substitution between monetary and nonmonetary gold. Here we pre¬ 
sent some indirect evidence that these elements of our model were 
plausibly present during the Gibson paradox period. 

We have already noted that the movements in the consol yield 
cannot be attributed to expected inflation and that this is true for 
stock yields as well. It would be almost impossible to attribute the 
changes in yields to monetary factors. Indeed, that would tend to 
imply a negative correlation between interest rates and prices. It ap¬ 
pears that real shocks to productivity or thrift are the likely sources of 
movements in both the consol and the stock yields. 

Some evidence of productivity shocks in particular can be inferred 
from the financial markets. In the absence of news about increased 
profitability, increases in the long-term real interest rate would cause 
stock prices to fall. In that case we would expect a strong negative 
correlation between changes in the bond yield and stock price 
changes. In fact, the correlation was essentially zero. For 1872-1913, 
the correlation between log changes in the consol yield and log 
changes in the Cowles (1939) composite real stock price index was 
-.06. The corresponding number for the U.S. railroad bond yield 
from Macaulay (1938) was .12. 

Monetary and Nonmonetary Gold 

The canonical model of the world price level under a gold standard, 
as developed by Friedman (1953) and Barro (1979) and reflected in 
Section II of this paper, involves substitution between monetary and 
nonmonetary gold as a prominent feature. Much informal discussion 
of the gold standard, however, appears to embody the presumption 
that the production of new gold was a sufficient statistic for changes in 
the monetary gold stock and thus ignores the role of changes in the 
desired stock of nonmonetary gold. 

While quantitative data of sufficient quality to justify statistical anal¬ 
ysis of flows between alternative uses of gold do not exist, 12 there is 

12 Kitchin (1931a, 1931b) presents some estimates of industrial demand. As Rockoff
(1984) notes, these numbers, having been constructed mainly by interpolation, are
artificially smooth.
literary evidence suggesting that the demand for nonmonetary gold
was an important determinant of the monetary gold stock. Probably
the best-known student of world gold stocks, Joseph Kitchin (see
Kitchin 1931a, 1931b), wrote:

For the purpose of the work of this group, the annual addi¬ 
tion to gold money is of more importance than the annual 
addition to the gold output and it is therefore necessary to go 
into the matter of consumption, especially so far as that con¬ 
sumption is the result of demand and is not automatic. When 
new gold is produced and comes into the market, the indus¬ 
trial arts, together with India and to some extent China, lay 
claim to a large proportion of it and the balance, from the 
nature of things, goes automatically to swell the amount of 
gold money. That is, in practice the manufacturers of money 
have no say as to what those additions to their stock should 
be, and no matter whether the balance after the satisfying of 
demand is large or small, the manufacturers of money have 
to accept it, whether they will or no. [1931a, p. 61] 

Writing 50 years earlier, Del Mar (1880) struck much the same note: 
“Upon a general review of the subject it would appear that now, at 
least, not coin, but the arts, are the first and principal attraction that 
determines the distribution of the precious metals, and that it is only 
after the demand for the arts has been satisfied that the supplies of 
specie are permitted to accumulate as coin” (p. 188). 

There is some quantitative evidence that the fraction of the gold 
stock in nonmonetary use was large. Del Mar (1880) presents num¬ 
bers extending back into the seventeenth century that indicate that 
nonmonetary gold accounted for about two-thirds of the total gold 
stock and that this fraction varied over time. The work of Kitchin 
(1931a, 1931b), covering a shorter period, suggests smaller numbers
for the ratio of nonmonetary to total gold, somewhere between a 
third and a half. Edie (1929) attempted a direct count of the gold in 
world central banks in two benchmark years and concluded that 
Kitchin had underestimated the extent of nonmonetary use. In Edie’s 
words: 


During the past fifteen years, the average annual gross prod¬ 
uct of the gold mines has been $392,000,000. This figure is 
derived from reasonably accurate reports to the Director of 
the Mint of the United States. Of this sum, $270,000,000 has 
annually been drawn off into hoarding or the industrial arts, 
leaving only $122,000,000 for monetary use. In other words, 
only 30 percent has become available as money; the remaining
70 percent has been drawn off into other uses.

According to this calculation, Mr. Kitchin has credited 
monetary stocks with nearly double the amount of new gold 
which actually has been added to them. [1929, pp. 34-35] 

In summary, a sizable fraction of the gold stock was held in non¬ 
monetary form, and this fraction was not constant over time. We 
therefore conclude that the size of the monetary gold stock was deter¬ 
mined both by the level of gold production and by the allocation of 
gold between monetary and nonmonetary use. 


V. Summary and Conclusion 

The Gibson paradox has proven to be an especially stubborn puzzle in 
monetary economics. We believe that taking account of the role of 
gold as an asset contributes significantly to our understanding of the 
anomaly. Our model accounts for the historical coincidence of the 
Gibson paradox and the gold standard, an observation made by 
Friedman and Schwartz (1982) but not incorporated in previous at¬ 
tempts to rationalize the Gibson phenomenon. Like the resolutions 
proposed by Keynes and Wicksell, ours involves the underlying real 
rate of return, as the data suggest Gibson's paradox did. However, 
our explanation, unlike those of Keynes and Wicksell, does not de¬ 
pend on counterfactual changes in private banks’ reserve ratios or the 
ratio of high-powered money to the monetary gold stock. The price 
level under the gold standard behaved in a fashion very similar to the 
way the reciprocal of the relative price of gold evolves today. Data 
from recent years indicate that changes in long-term real interest 
rates are indeed associated with movements in the relative price of 
gold in the opposite direction and that this effect is a dominant fea¬ 
ture of gold price fluctuations. 

The principal problem with our resolution of Gibson’s paradox is
the minimal role that it accords gold discoveries. New gold production
accounted for a significant share of the variation in the price level
during the nineteenth and early twentieth centuries. In particular,
the post-1896 rise in prices, after more than two decades of deflation,
is usually attributed to gold discoveries in combination with the
development of the cyanide process for extraction. If this is correct, the
comovement of prices and interest rates during the period 1896-1913
remains something of a puzzle, and thus our proposed resolution
of the Gibson paradox cannot be the whole answer. To reach a
verdict on the quantitative importance of the mechanism in this paper
would require better methods for proxying movements in the stocks
of monetary and nonmonetary gold, and this might be an appropriate
topic for further research.
The Characteristics Model, Hedonic Prices,
and the Clientele Effect 


Larry E. Jones 

Northwestern University 


In this paper, the characteristics model of Lancaster is reconsidered. 
It is shown by example that equilibrium prices need not be linearly 
decomposable. It does follow that equilibrium prices must be a con¬ 
vex function of characteristics, however. Further, it is shown that this 
fact holds independent of the form of firm competition (e.g., perfect 
or monopolistic). Finally, the predictions of the theory are discussed 
in the context of two empirical examples. 


I. Introduction 

The characteristics model of differentiated products (adapted by 
Lancaster [1971] from the work on income distributions by Tin¬ 
bergen [1956] and Mandelbrot [1962]) has had many uses in both 
theoretical and applied economics. One of the most useful aspects of 
the model is the hedonic decomposition of prices that the approach 
affords. That is, the price of a typical good is written as the sum of 
characteristics prices times the levels of the characteristics embodied 
in that good. This has proven quite useful for, among other things, 
adjusting price indices for changes in the qualities of the goods in the 
market basket (see Griliches 1971). 

tion for their financial assistance. Of course, remaining errors are my responsibility.

The purpose of this paper is to present a reexamination of Lancaster's
characteristics model in light of some recent research on models
of commodity differentiation (Mas-Colell 1975; Hart 1979; Jones 
1984). First, it is shown that the characteristics model can be viewed as 
arising from special restrictions on the allowed preferences within 
these more general models of commodity differentiation. Second, it is 
shown by example that, even in quite reasonable (and robust) circum¬ 
stances, the decomposition of prices mentioned above can fail to hold 
in equilibrium. 

Given this fact, the possibility of positive results concerning the 
form of equilibrium prices as a function of characteristics is explored. 
It is shown that even though price linearity may fail, prices are a 
convex function of characteristics in equilibrium (proposition 1). Fur¬ 
ther, if all individuals have the same homothetic utility function over 
characteristics (but possibly different incomes), it is shown that equi¬ 
librium prices can be linearly decomposed through the hedonic tech¬ 
nique (proposition 3). It is also shown (propositions 2 and 4) that 
these two results concerning the form of the equilibrium price func¬ 
tion hold independent of the underlying market structure (e.g., per¬ 
fect vs. monopolistic competition) as long as consumers act as price 
takers. Finally, it is shown that both linearity and (strict) convexity of 
the price function are robust cases. Similar results hold for input 
markets as long as the purchasing firms are price takers. 

The fact that equilibrium prices cannot generally be linearly de¬ 
composed is a result of boundary problems arising in the consumer’s 
maximization problem within the Lancastrian approach. This observation
in and of itself is not new (see, e.g., Muellbauer 1974). These
earlier authors were concerned primarily with consumers’ decisions, 
however, and did not explore the impact of these considerations on 
equilibrium prices. In addition, the relevance of the underlying mar¬ 
ket structure was not discussed. In essence, the problem arises be¬ 
cause (as stressed by Mandelbrot) characteristics cannot be unbundled 
from goods. 

In Section II, notation is introduced and a simple example of the 
model is presented in which prices are not linear in characteristics. 
The positive results concerning the form of equilibrium prices are 
presented in Section III. Section IV contains an analysis of price data 
on multiple vitamins as an empirical test of the predictions of Section 
III and a discussion of some related work on labor markets. Finally, a 
few concluding remarks are offered as Section V. 

II. Some Notation and an Example 

After first setting out some notation, I will present an example in 
which prices cannot be linearly decomposed. 

In the economy considered here, goods are completely described
by the levels of the various characteristics they possess. There are J
characteristics of interest to consumers indexed by j. Thus a good can
be described as a point, t, in $T = I_1 \times I_2 \times \cdots \times I_J$,
where $I_j \subset R_+$ describes the possible levels of the jth
characteristic a good can have.

Notice that under this formulation goods with the same
characteristics are identified. Since the consumers will act as price takers
and care only about the total amount of characteristics they receive,
goods with the same characteristics must fetch the same price (if they
are sold at all). 1

Individuals choose consumption bundles, which we will model as
nonnegative distributions (measures) on T. This is just the familiar
notion of probability modified to allow for total mass unequal to one.
The collection of nonnegative measures on T will be denoted by M,
and a typical consumption bundle will be written as m. Thus we will
follow the approach to consumer choice introduced in Mas-Colell
(1975) and used in Hart (1979) and Jones (1984). The advantage of
this approach is that it allows for consumers to specialize by choosing
distributions concentrated on a few goods or to generalize by
choosing a distribution with a density.

Notice that modeling consumption in this way assumes that goods 
are perfectly divisible; see the related comments in Section V. 

If a consumer purchases the commodity bundle m, the total amount
of the jth characteristic he consumes is given by

$$c_j(m) = \int_T t_j \, dm(t).$$

(This is the Lebesgue-Stieltjes integral as in probability theory.)
Equivalently, if $m_j$ is the marginal distribution of m on the jth
component of t,

$$c_j(m) = \int_{I_j} s \, dm_j(s).$$

The essence of the characteristics approach is that individuals rank
the various consumption bundles through the total amount of the J
characteristics. That is, the preferences of individual h are given by

$$V^h(m) = U^h(c_1(m), \ldots, c_J(m)), \qquad (1)$$

where $U^h(\cdot)$ is a standard utility function over $R^J_+$.

Thus the preferences considered will be restricted to be of the
linearly combinable variety. 2 That is, as can be seen from the
definition of $c_j$, in the preferences we will allow, the characteristics of the
individual goods are linearly combined into an aggregate
characteristics bundle $(c_1(m), \ldots, c_J(m))$, which determines the utility of
the consumption bundle.

1 This is not simply a theoretical possibility in these models since they are often used
to generate theories of the equilibrium level of product diversity (cf. Lancaster 1975).
2 I am indebted to Sherwin Rosen for suggesting this terminology.

We will assume that $U^h$ is strictly increasing, strictly concave, and
twice continuously differentiable.

We can now define the marginal value to h of increased
consumption of a good with characteristics t when h is consuming m. We will
call this h's marginal utility of t at m, $MU^h(m; t)$. When utility
functions are as in (1),

$$MU^h(m; t) = \sum_{j=1}^{J} U^h_j(c_1(m), \ldots, c_J(m)) \cdot t_j = \operatorname{grad} U^h \cdot t, \qquad (2)$$

where $U^h_j$ is the jth partial derivative of $U^h$. This is the directional
derivative of the utility function $V^h$ in the direction of commodity t.

Note that $MU^h$ is linear in t. This provides a theoretical basis for the
hedonic decomposition of prices. That is, if, in equilibrium, the $U^h_j$'s
are proportional across households, (2) describes equilibrium prices
as well. Thus prices are linear and the $U^h_j$'s give the “characteristics
prices” in this case.
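
As a purely illustrative sketch (not part of the original analysis), the
definitions of $c_j(m)$ and of the marginal utility in (2) can be checked
numerically for a bundle concentrated on finitely many goods. The utility
function $U(c) = \sum_j \sqrt{c_j}$, the bundle, and the two goods named
`steak` and `carrot` are assumptions chosen only for the illustration.

```python
import math

def characteristics(bundle):
    """Total characteristics c(m) of a finitely supported bundle.

    `bundle` maps a good t (a tuple of characteristic levels) to the
    quantity m(t) held, so c_j(m) = sum over t of t_j * m(t).
    """
    n_chars = len(next(iter(bundle)))
    return tuple(sum(t[j] * qty for t, qty in bundle.items())
                 for j in range(n_chars))

def grad_u(c):
    """Gradient of the assumed utility U(c) = sum_j sqrt(c_j)."""
    return tuple(0.5 / math.sqrt(cj) for cj in c)

def marginal_utility(bundle, t):
    """MU(m; t) = grad U(c(m)) . t, the form given in equation (2)."""
    g = grad_u(characteristics(bundle))
    return sum(gj * tj for gj, tj in zip(g, t))

# Hypothetical bundle: 2 units of good (1, 0) and 4 units of good (0.5, 0.5).
bundle = {(1.0, 0.0): 2.0, (0.5, 0.5): 4.0}
c = characteristics(bundle)

# Two hypothetical goods and their 50/50 mixture in characteristics space.
steak, carrot = (1.0, 0.2), (0.1, 0.9)
mix = tuple(0.5 * a + 0.5 * b for a, b in zip(steak, carrot))

# Linearity in t: MU of the mixture equals the mixture of the MUs.
gap = marginal_utility(bundle, mix) - 0.5 * (
    marginal_utility(bundle, steak) + marginal_utility(bundle, carrot))
```

Holding the bundle m fixed, the marginal utility of any good is a fixed
linear function of its characteristics, which is why proportional
gradients across households would deliver linear prices.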

The problem of guaranteeing that prices are linear in
characteristics is thus reduced to finding conditions under which the $U^h_j$'s are
proportional in equilibrium. One would like to argue that if this were
not the case, mutually advantageous trades in characteristics can be
made as long as all individuals are allotted positive amounts of all
characteristics. This argument would lead one to believe that, in
equilibrium, the $U^h_j$'s should be proportional other than in the exceptional
cases in which some agent has a zero allotment of some characteristic.
Unfortunately, this argument is incorrect. The problem with it is that
agents cannot trade characteristics directly; rather they must trade
characteristics bundled as goods. Thus it may not be possible to find
mutually advantageous trades even though marginal utilities of
characteristics are not proportional and all agents are consuming positive
amounts of all characteristics. This occurs because of what is known in
the finance literature (in the context of differential tax treatment) as
the clientele effect. That is, consumers divide themselves into groups,
each group buying a different collection of commodities. The
following example illustrates this phenomenon.


Example 

Consider a two-consumer world with no production in which goods
have different levels of two characteristics. It is useful to think of the
various goods as different foods described by their contents of
protein and vitamin A. Here, J = 2: steak is represented by a t with a
large first component and carrots one with a large second component.

Let $I_1 = I_2 = [0, 1]$, and suppose that the aggregate social
endowment is given by the uniform distribution on T = [0, 1] x [0, 1], that
is, the distribution with density equal to one everywhere in T. Let $B_1$
be those commodities in T with larger first components than second
and $B_2$ those with larger second components.

Consider the allocation that gives the first household the social
endowment on $B_1$ and gives the second household the social
endowment on $B_2$. Denote these by $m_1$ and $m_2$, respectively.

Suppose that the utility functions for the two individuals are such
that

$$\operatorname{grad} U^1(c_1(m_1), c_2(m_1)) = (2, 1)$$

and

$$\operatorname{grad} U^2(c_1(m_2), c_2(m_2)) = (1, 2).$$

In this case, $(m_1, m_2)$ is a Pareto-optimal distribution of the social
endowment, even though both agents are assigned positive quantities
of both characteristics and the marginal rates of substitution between
characteristics of the two disagree.

To see this, consider a trade in which $v_1$ is taken from the first agent
and given to the second and $v_2$ is taken from the second agent and
given to the first. By design, the first agent gives up more of the first
characteristic than the second and the second agent gives up more of
the second than the first. Thus, after completion of this trade (which
leaves the first agent with $m_1 - v_1 + v_2$ and the second with
$m_2 - v_2 + v_1$), the first agent has more of the second characteristic
than he began with and less of the first. Let $\Delta^1_1$ and $\Delta^1_2$
represent the (absolute) changes in the levels of consumption of the two
characteristics by the first agent, and let $\Delta^2_1$ and $\Delta^2_2$ be
the corresponding quantities for the second agent. Then, for the first
agent's welfare to be improved by the trade, it must be true that
$\Delta^1_2 > 2\Delta^1_1$. Similarly, if the second agent's welfare is to be
improved, $\Delta^2_1 > 2\Delta^2_2$. Both of these inequalities cannot
hold, however, since $\Delta^2_1 = \Delta^1_1$ and $\Delta^2_2 = \Delta^1_2$. 3


3 Note that in this example any trade offered can be considered as being between two
goods. For example, for both agents, $v_1$ (defined above) is equivalent to a trade of
$v_1(T)$ units of the good with characteristics

$$t^* = \frac{1}{v_1(T)} \int_T t \, dv_1(t),$$

where $t^* = (t^*_1, t^*_2)$ is in $B_1$ since $B_1$ is convex and $t^*$ is the mean of the
probability $[1/v_1(T)]v_1$ on $B_1$. Similar reasoning holds for $v_2$.

Thus $(m_1, m_2)$ is indeed Pareto optimal. Hence, it can be supported
as the equilibrium of an exchange economy. The supporting prices
are given by

$$p^*(t) = \max(2t_1 + t_2, \; t_1 + 2t_2) = \max[MU^1(m_1; t), \; MU^2(m_2; t)].$$

For this to be an equilibrium, any assignment of endowments such that
$e_1 + e_2 = m_1 + m_2$ and $e_h \cdot p^* = m_h \cdot p^*$, h = 1, 2, will
do; for example, $e_h = m_h$ or $e_1 = e_2 = \frac{1}{2}(m_1 + m_2)$.

To understand the reasoning behind the example better, see figure
1. Depicted is the Edgeworth box in characteristics space following
Rosen (1983). The point $(c_1, c_2)$ corresponds to the aggregate social
endowment of characteristics, that is, (1/2, 1/2) in the example. If
characteristics were not bundled in goods, every point in the box would be
available. When the “bundling constraints” are imposed, the “true”
possibilities are seen. In the example, this is given by the lens-shaped
region ABCD. The point B corresponds to the initial endowment, and
our assumptions on the gradients imply that the indifference curves
are as depicted. Thus the contract curve in characteristics space, AEC,
lies outside the feasible set. This implies that, in equilibrium, the
two agents' marginal rates of substitution between characteristics must
differ and hence prices cannot be linear.

Note that this occurs even though both agents are consuming
positive amounts of both characteristics. Thus, although it is a boundary
problem that is causing the nonlinearity, it is in goods, not
characteristics.
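
The example can be checked with a short numerical sketch. This is an
illustration only: the function names and the midpoint-rule grid are
assumptions of the sketch, not part of the paper. It approximates the
characteristics totals the first household receives from the uniform
endowment on $B_1$ and evaluates the supporting price
$p^*(t) = \max(2t_1 + t_2, t_1 + 2t_2)$ at a few goods.

```python
def support_price(t):
    """Supporting price p*(t) = max(2*t1 + t2, t1 + 2*t2) from the example."""
    t1, t2 = t
    return max(2 * t1 + t2, t1 + 2 * t2)

def characteristics_on_b1(n=200):
    """Midpoint-rule approximation of c(m_1), where m_1 is the uniform
    endowment restricted to B_1 = {t in [0,1]^2 : t1 > t2}."""
    h = 1.0 / n
    c1 = c2 = 0.0
    for i in range(n):
        t1 = (i + 0.5) * h
        for j in range(n):
            t2 = (j + 0.5) * h
            if t1 > t2:
                c1 += t1 * h * h
                c2 += t2 * h * h
    return c1, c2

c1, c2 = characteristics_on_b1()

# If p* were linear, the price of the midpoint good (0.5, 0.5) would be
# the average of the prices of (1, 0) and (0, 1).
linear_guess = 0.5 * (support_price((1.0, 0.0)) + support_price((0.0, 1.0)))
actual = support_price((0.5, 0.5))
```

The first household's totals are close to (1/3, 1/6), so it holds positive
amounts of both characteristics, while the midpoint good is priced below
the value a linear price function would assign to it.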

One might think that by putting sufficiently strong restrictions on
the form of the $U^h$'s we could guarantee that all goods are consumed
by all agents and thus obtain the desired linearity of prices in terms of
characteristics. One is tempted to do this by making all goods
“essential” as in Cobb-Douglas-type utility functions. This is not possible
with utility functions of the type exhibited in (1), however. Indeed, it
cannot be done when preferences depend only on the total levels of
the characteristics, as we have assumed. The only way to obtain this
effect is to go to preferences of the additively separable variety as seen
in the continuous-time growth literature. As can be readily seen from
that literature, linear prices should not be expected in economies with
this type of preference structure.

Before we proceed, three comments are in order. 

First, many models of product differentiation feature consumption
sets that have indivisibilities as an essential feature (e.g., Rosen 1974;
Mas-Colell 1975). In fact, one of Rosen’s principal criticisms of the
characteristics model is that it assumes perfectly divisible goods.
Introducing indivisibilities would not obviate the problem mentioned
above in any way, however. 4 In fact, it would nullify some of the
results to be presented below. The problem is that, because of the
bundling of characteristics in terms of goods, there are not enough
trading possibilities. Adding indivisibilities only exacerbates this
problem.

Second, a natural question to ask is whether or not the inclusion of
production will restore the desired price linearity. Certainly there are
assumptions on technology that will give rise to the desired result, but
these are of the most ad hoc nature (constant returns to scale with
$t = (t_1, \ldots, t_J)$ requiring $t_1$ units of input 1, $t_2$ units of
input 2, etc.). Beyond this, one would have to expect any such efforts to
fail. After all is said and done, the cause of the problem is that
different agents consume different goods, causing much of the usual
marginal analysis to break down. There is no reason to believe that the
inclusion of production would serve to keep this from occurring.

4 Rosen's work shows that this is true. His second main objection to the characteristics
model is one of realism. He points out that owning two 6-foot cars is not the same as
owning one 12-foot car. Although this is clearly true, it really has nothing to do with the
divisible vs. indivisible issue. The example of cars is one in which utility does not
depend only on total quantities of characteristics. If Rosen's framework were extended
to allow for the possibility of the purchase of more than one car, more general
preferences of the variety used in Mas-Colell would have to be allowed in order to overcome
this second (and quite valid) objection.

Finally, within the finance literature there is a natural assumption
that will restore the linearity of prices: allowing unlimited short
selling. As can be readily seen, this will serve to equate the appropriate
marginal rates of substitution so that (2) is indeed valid (see Ross
1983).

Let us turn now to a discussion of what can be said about the form 
of price functions when preferences are of the form given in (1) 
above. 

III. Results 

Let us begin our discussion of the equilibrium properties of prices
with an examination of a perfectly competitive economy with no
production.

Suppose that there are H households indexed by h, with utility
functions as described in (1) above. Each household is endowed with a
distribution, $e^h$ in M, over T. Then $e^* = \sum_h e^h$ is the aggregate
social endowment. The collection of available commodities will be denoted
by S. This is the support of $e^*$, $S = \operatorname{supp} e^*$, and is the
smallest closed subset of T having full $e^*$ mass.

Assume that S is bounded and that zero is not in S. (This is for
technical reasons concerning monotonicity of preferences.) In this
case, it has been shown (see Jones 1984) that there is a competitive
equilibrium with prices that are continuous on S. Let p be such a
continuous equilibrium price function.

Proposition 1. Under the assumptions outlined above: (i) p is
convex on S; that is, if $t_1$, $t_2$, and $\alpha t_1 + (1 - \alpha)t_2$,
$0 \le \alpha \le 1$, are in S,
$p[\alpha t_1 + (1 - \alpha)t_2] \le \alpha p(t_1) + (1 - \alpha)p(t_2)$.
(ii) p is linearly homogeneous on S; that is, if t and $\beta t$ are in S for
some $\beta > 0$, $p(\beta t) = \beta p(t)$.
That is, p is the restriction to S of a convex and linearly homogeneous
function.

Proof. Let $S^h = \operatorname{supp} m^h$ be the collection of commodities
purchased by household h. Then a straightforward argument along the lines
of those given in Jones (1984) shows that for each h there is a constant
$\lambda^h > 0$ such that

$$p(t) \ge \frac{1}{\lambda^h} MU^h(m^h; t) \quad \text{for all } t \in S,$$

with equality if t is in $S^h$. From this it follows that

$$p(t) \ge \max_h \left[ \frac{1}{\lambda^h} MU^h(m^h; t) \right]$$

for all t in S. Since the $S^h$'s exhaust S, equality holds for some h and

$$p(t) = \max_h \left[ \frac{1}{\lambda^h} MU^h(m^h; t) \right]. \qquad (3)$$

Since max is a convex function and $MU^h$ is linear in t, part i follows.

The argument that part ii holds is equally straightforward. 

As can be readily seen from the proof, the conclusion of the
proposition holds as long as $MU^h(m; t)$ is a convex and linearly
homogeneous function of t for all m. The characteristics model is a special
case in which this holds. Further, if $MU^h$ is convex for all h, p will be
convex; if $MU^h$ is linearly homogeneous for all h, p will be as well.
Clearly, any property preserved by the max operation will hold under
aggregation.
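
The role of the max operation can be illustrated with a small numerical
check. The sketch below is hypothetical: the gradient vectors stand in for
the scaled marginal utilities $(1/\lambda^h) \operatorname{grad} U^h$ of
equation (3), and the random test points are an assumption of the
illustration rather than data from the paper.

```python
import random

def envelope_price(gradients, t):
    """p(t) = max over h of (gradient_h . t), the form of equation (3)."""
    return max(sum(g * x for g, x in zip(grad, t)) for grad in gradients)

def check_convex_and_homogeneous(gradients, trials=1000, seed=0):
    """Test properties i and ii of proposition 1 at random points."""
    rng = random.Random(seed)
    dim = len(gradients[0])
    for _ in range(trials):
        t1 = [rng.random() for _ in range(dim)]
        t2 = [rng.random() for _ in range(dim)]
        a = rng.random()
        beta = 0.1 + 2.0 * rng.random()
        mix = [a * x + (1 - a) * y for x, y in zip(t1, t2)]
        # Property i: p(a*t1 + (1-a)*t2) <= a*p(t1) + (1-a)*p(t2).
        lhs = envelope_price(gradients, mix)
        rhs = (a * envelope_price(gradients, t1)
               + (1 - a) * envelope_price(gradients, t2))
        if lhs > rhs + 1e-12:
            return False
        # Property ii: p(beta*t) = beta*p(t).
        scaled = [beta * x for x in t1]
        diff = envelope_price(gradients, scaled) - beta * envelope_price(gradients, t1)
        if abs(diff) > 1e-9:
            return False
    return True

# Gradients from the example of Section II, plus an arbitrary 3-household case.
two_agents = [(2.0, 1.0), (1.0, 2.0)]
three_agents = [(1.0, 0.5, 0.2), (0.3, 1.2, 0.4), (0.2, 0.2, 1.5)]
ok_two = check_convex_and_homogeneous(two_agents)
ok_three = check_convex_and_homogeneous(three_agents)
```

Any finite collection of linear functions passes both checks, which is the
sense in which convexity and linear homogeneity survive aggregation.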

Convexity of the price function implies that prices are lower away
from the boundary of T than would be expected if the price function
were linear. Intuitively, this is because there is more competition for
goods near the center in characteristics space. That is, producers in
the interior compete not only with producers of nearby goods but
with producers on the boundary as well since their customers can
satisfy their needs by buying from combinations of boundary
producers.

Part ii of the proposition is interesting. Originally, one’s intuition is
that, given the structure of preferences, there are really only J goods,
namely, the various characteristics. The example presented in Section
II shows that this is incorrect. In fact, there are infinitely many
commodities in that example. However, ii shows that there is really only
a one-dimensional family of commodities, not two. In general, the
model can be reduced to one of a (J - 1)-dimensional set of
commodities. These are conveniently summarized by the relative
proportions of the various characteristics. Note that if there are
indivisibilities in consumption sets, neither i nor ii need hold.

Conditions i and ii above are virtually the only restrictions one can 
place on prices in these models. In fact, by allowing a continuum of 
consumers, one can construct examples in which any convex and 
linearly homogeneous function is generated as equilibrium prices for 
an exchange economy. 

The problem with proposition 1 as stated is that it is tied too heavily 
to competitive equilibria. Really, these properties arise solely because of 
the price-taking nature of the households considered here. Because 
of this, the results are more general than they first appear. In fact, as 
is evident from Heckman and Scheinkman (1987), the results will 
hold as long as the distribution of goods across buyers is efficient. 

To see this, suppose that S is a finite set (this would be expected if
there are any scale economies in production) and let $p(\cdot)$ be a price
function on S.

Proposition 2. Suppose that $m^h$ maximizes $V^h$ on h's budget set
$\beta^h = \{m \mid m \cdot p \le w^h\}$ and
$\operatorname{supp}(\sum_h m^h) = S$. Then p is the restriction to S of a
convex and linearly homogeneous function.

The proof is the same as that of proposition 1. 

Thus we see that these two qualitative characteristics of prices
follow simply because of the assumption of price-taking consumers with
preferences of the form in (1). Hence, any model of firm competition
with these features will give rise to prices (for traded goods) with the
properties listed.

Note that, from the propositions, a qualitative prediction about the 
success of linear regression applied to these models is possible. We 
would expect that the model would systematically underpredict the 
prices of goods in the extremes of T and overpredict the prices of 
those goods with significant levels of all characteristics. This is the 
essence of the test to be performed in Section IV. 
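
This prediction can be illustrated with a small simulation. The sketch is
an assumption-laden illustration, not the test of Section IV: prices are
generated without noise from the convex price function of the Section II
example on a hypothetical 5 x 5 grid of goods, and a no-intercept linear
regression is fitted by solving the two normal equations directly.

```python
def convex_price(t1, t2):
    """Convex, linearly homogeneous price function from the Section II example."""
    return max(2 * t1 + t2, t1 + 2 * t2)

def fit_no_intercept(goods, prices):
    """Least squares for p = a1*t1 + a2*t2 via the 2x2 normal equations."""
    s11 = sum(t1 * t1 for t1, _ in goods)
    s22 = sum(t2 * t2 for _, t2 in goods)
    s12 = sum(t1 * t2 for t1, t2 in goods)
    b1 = sum(t1 * p for (t1, _), p in zip(goods, prices))
    b2 = sum(t2 * p for (_, t2), p in zip(goods, prices))
    det = s11 * s22 - s12 * s12
    return (s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det

# Hypothetical sample: a grid on [0,1]^2 with the origin excluded.
goods = [(i / 4, j / 4) for i in range(5) for j in range(5) if (i, j) != (0, 0)]
prices = [convex_price(t1, t2) for t1, t2 in goods]
a1, a2 = fit_no_intercept(goods, prices)

def residual(t1, t2):
    """Actual price minus fitted price; positive means underprediction."""
    return convex_price(t1, t2) - (a1 * t1 + a2 * t2)

extreme_1 = residual(1.0, 0.0)   # good with only the first characteristic
extreme_2 = residual(0.0, 1.0)   # good with only the second characteristic
balanced = residual(1.0, 1.0)    # good with high levels of both
```

In this constructed sample the linear fit underpredicts the two extreme
goods and overpredicts the balanced good, matching the sign pattern
described above.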

Let us turn now to a discussion of a case in which the linear
decomposition of prices does hold. Although the assumptions are strong,
they may serve as a useful approximation in some cases. In addition,
the result gives further insight into how the linearity of prices can fail.
Let us return to the situation considered in proposition 1.

Proposition 3. Suppose that there is some $U: R^J_+ \to R$ such that (i)
$U^h = U$ for all h (hence U is twice continuously differentiable, strictly
concave, etc.) and (ii) U represents homothetic preferences over
characteristics; that is, $U(x_1) = U(x_2)$ implies
$U(\alpha x_1) = U(\alpha x_2)$ for all $\alpha > 0$.
Then p is linear in t.

Proof. As in proposition 1, let $m^h$ be h's equilibrium allocation and
define $w^h = p \cdot m^h$. Let

$$A(w) = \{(c_1(m), \ldots, c_J(m)) \mid p \cdot m \le w\}.$$

This is the collection of characteristics bundles that an agent with 
income w can afford at prices p. 

It is straightforward to show that A(w) is compact and convex.
Further, A(tw) = tA(w) for any scalar t > 0.

Let $c(m^h) = (c_1(m^h), \ldots, c_J(m^h))$. Then it is easy to see that
$c(m^h)$ is the unique (by strict concavity of U) maximizer of U on
$A(w^h)$. Further, since A(tw) = tA(w) and U is homothetic, it follows that
$c(m^h) = (w^h / w^{h'}) c(m^{h'})$ for all h, h'. When the homotheticity
of U is used, it follows that

$$\operatorname{grad} U^h(c(m^h)) = \alpha \operatorname{grad} U^{h'}(c(m^{h'}))$$

for some $\alpha > 0$.
The result now follows from (3) and the fact that $w^h > 0$ for all h.

One might think that by mimicking this proof the homotheticity of
U could be dropped. That is, if all agents have the same utility
function over characteristics, the $c(m^h)$ all lie on the U income
expansion path in characteristics space and so the same argument should
work. In fact, this argument is not valid since A(w) is not necessarily of
the form of usual budget sets. That is, if
$\pi = \operatorname{grad} U(c(m^h))$ and if $c^*$ is on the $\pi$ income
expansion path for U at $w = \alpha w^h$, it need not be true that
$c^* \in A(w)$. Since $A(\alpha w^h) = \alpha A(w^h)$, $c^*$ is in A(w) if
preferences over characteristics are homothetic because
$c^* = \alpha c(m^h)$ and $c(m^h) \in A(w^h)$.

The assumption that the $U^h$’s are the same is easily seen to be
essential. In fact, the example given in Section II is quite consistent
with the two agents’ having homothetic, but different, utility
functions.

Of course, proposition 3 can be extended to cover other market 
structures in much the same way that proposition 1 was extended. We 
have the following proposition. 

Proposition 4. Suppose that S is finite and that the $U^h$'s satisfy the
assumptions of proposition 3. If $m^h$ maximizes $V^h$ on
$\beta^h = \{m \mid m \cdot p \le w^h\}$ with $w^h > 0$, p is the
restriction to S of some linear function on T.

IV. Empirical Examples 

The Vitamin Market 

The considerations raised in the previous two sections concerning the
possibility of nonlinearity in the price function would be of only
limited interest if prices were in fact linear in all examples in which the
model is applicable. In this section, in an attempt to address this issue,
I will discuss an example for which the characteristics model seems a
very good approximation. There are many examples in the literature
of empirical estimates of hedonic price functions. Rather than try to
present a comprehensive list here, I direct the reader to Griliches
(1971).

The example is that of nonprescription multiple vitamins. A data 
set consisting of 277 observations was constructed. Manufacturers’ 
specifications listed as ingredients 24 separate vitamins and minerals. 
Of the 277 observations, 163 of the vitamins contained only one of the 
ingredients while the remainder were combinations of two or more of 
the ingredients. 

Let $t_i$ be the vector (in $R^{24}$) consisting of the quantities of the
vitamins and minerals in a single tablet of the ith observation and let
$p_i$ denote its price. The two models considered are the following.

TABLE 1

Ingredient (Units)          Coefficient Estimate (¢/Unit)        t

Vitamin A (IU)                        .00009                   2.539
Vitamin C (mg)                        .00759                  16.531
Vitamin B1 (mg)                       .03356                   2.871
Vitamin B2 (mg)                       .04777                   3.935
Niacin (mg)                           .00880                   1.925
Choline (mg)                          .00596                   1.370
Vitamin B6 (mg)                       .02850                  10.972
Vitamin B12 (mcg)                     .000009                  2.958
Vitamin D (IU)                        .00064                    .766
Vitamin E (IU)                        .02380                  22.355
Folic acid (mcg)                      .00325                   2.243
Biotin (mcg)                          .00652                    .941
Pantothenic acid (mg)                 .00049                    .820
Iron (mg)                             .04743                   3.550
Calcium (mg)                         -.00375                  -2.130
Magnesium (mg)                        .01178                   3.049
Copper (mg)                          -.00468                   -.418
Zinc (mg)                             .05536                   2.920
Chromium (mcg)                       -.05022                  -3.064
Selenium (mcg)                        .04613                   3.294
Molybdenum (mg)                       .02721                   1.368
Manganese (mg)                        .05851                   1.125
Potassium (mg)                        .02124                   1.597
Iodine (mcg)                         -.00168                   -.514


Model 1. $p_i = \alpha' t_i + \epsilon_i$, where the $\epsilon_i$ are
independently and identically distributed with $E(\epsilon_i) = 0$ and
$\alpha$ is a vector of parameters.

Model 2. $p_i = f(t_i) + \gamma_i$, where the $\gamma_i$ are independently
and identically distributed with $E(\gamma_i) = 0$ and, as suggested by
Section III, f is convex and linearly homogeneous.

A preliminary regression was run to estimate model 1. The results
of this estimation are summarized in table 1. As can be seen from the
table, the model fits quite well. The $R^2$ of .89 implies that the
regression is significant at even the .0001 level. 5 Most of the
coefficients are of the correct sign, and many are highly significant.

All the vitamins and minerals were represented in the sample as 
pure, single-ingredient tablets. This suggests a simple way to test 
whether or not model 2 is statistically indistinguishable from model 1 
for this data set: to estimate model 1 using only the data on pure 
vitamins and minerals and then use the remainder of the data as a 
validation check. 

5 Note that since no constant is being used in the regression, the definition and
interpretation of $R^2$ are altered slightly (see Aigner 1971).

Let $X_1$ be the matrix of characteristics of the pure vitamins and let
$X_2$ be the matrix of characteristics of those vitamins that are
combinations. Similarly, partition the observations on the prices
$p' = (p_1', p_2')$.

If we define $\hat\alpha = (X_1'X_1)^{-1}X_1'p_1$, we see that under the
assumptions of model 1, $\hat\alpha$ is an unbiased estimate of $\alpha$.
Define $\hat p = X_2\hat\alpha$ and $\delta = p_2 - \hat p$.
Finally, let

$$K = \frac{1}{n_2} \sum_{i=1}^{n_2} (p_i - \hat p_i),$$

where $n_2$ is the number of combination vitamins and the sum runs over
the combination vitamins. Then, under the assumption that model 1 is
correct, it is straightforward to verify that E(K) = 0 and
$\operatorname{var}(K) = (\sigma^2/n_2) + \bar t' \operatorname{cov}(\hat\alpha) \bar t$,
where $\sigma^2 = \operatorname{var}(\epsilon_i)$ and $\bar t_j$ is
the average level of the jth characteristic within the sample of
combined vitamins. Note that, by construction, $X_1$ is an orthogonal design
matrix so that $\operatorname{cov}(\hat\alpha_i, \hat\alpha_j) = 0$ if $i \ne j$.

If model 2 is correct and model 1 is not, it still follows (because of
the linearity on rays of f) that $E(\hat\alpha_i) = f(e_i)$, where $e_i$ is
the ith unit vector. Further, since f is convex, it follows that
$E(p_i) \le E(\hat p_i)$ for all i. Hence, $E(K) \le 0$ if model 2 holds.
Thus, if model 2 holds and model 1 does not, one would expect K to be
negative and significant. Indeed, this is exactly what we find as
K = -.0937 and var(K) = .0014; hence $K/\hat\sigma_K = -2.492$, which is
highly significant. (Note that if the $\epsilon_i$ are normally
distributed, $K/\hat\sigma_K$ has a t-distribution with 139 degrees of
freedom.)

Finally, as further evidence of the nonlinearity of p(t), note that the
procedure outlined above overpredicted the true price (i.e.,
$\hat p_i > p_i$) for 101 of the 114 multiple vitamins in the data set.
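
The logic of the validation check can be sketched on synthetic data. The
numbers below are hypothetical and are not the vitamin data: the "true"
price function is assumed to be a weighted Euclidean norm (which is convex
and linearly homogeneous), prices are noise free, and the helper names are
inventions of the sketch. Because each pure tablet contains a single
ingredient, the design is orthogonal and each coefficient is estimated
separately.

```python
import math

def true_price(t, weights=(3.0, 4.0)):
    """Assumed convex, linearly homogeneous f(t): a weighted Euclidean norm."""
    return math.sqrt(sum((w * x) ** 2 for w, x in zip(weights, t)))

def fit_on_pure(pure_goods, prices, n_chars):
    """Model 1 fitted on single-ingredient goods only.

    With an orthogonal design, alpha_j = sum(t_j * p) / sum(t_j ** 2),
    summing over the pure goods that contain ingredient j.
    """
    num = [0.0] * n_chars
    den = [0.0] * n_chars
    for t, p in zip(pure_goods, prices):
        j = max(range(n_chars), key=lambda k: t[k])  # the one nonzero entry
        num[j] += t[j] * p
        den[j] += t[j] * t[j]
    return [n / d for n, d in zip(num, den)]

def prediction_errors(alpha, combos, prices):
    """p_i - p_hat_i for each combination good."""
    return [p - sum(a * x for a, x in zip(alpha, t))
            for t, p in zip(combos, prices)]

pure = [(1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (0.0, 0.5)]
combos = [(1.0, 1.0), (1.0, 0.75), (2.0, 0.5)]

alpha_hat = fit_on_pure(pure, [true_price(t) for t in pure], 2)
errors = prediction_errors(alpha_hat, combos, [true_price(t) for t in combos])
K = sum(errors) / len(errors)   # mean of p_i - p_hat_i over combinations
```

Under these assumptions the coefficients fitted on pure goods recover
$f(e_j)$ exactly, every combination good is overpredicted, and K is
negative, which is the pattern model 2 implies.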


The Labor Market 

The results are relevant to another empirical example as well. This
example comes from the work on labor markets. In wage regressions
it has long been noted that the regression coefficients of various
worker characteristics differ significantly when the data are divided in
obvious ways (see Heckman and Scheinkman [1987] for a complete
discussion of this observation). For example, if the data are divided
along North-South lines, the regression coefficients on education
differ in the two samples. More generally, it is shown in the empirical
results of Heckman and Scheinkman that significant differences in
coefficient estimates are obtained for North-South,
blue-collar-white-collar, and manufacturing-services splits of the data.

The standard explanation for this phenomenon has been factor
immobility. However, if this were correct, one would expect to see the
higher coefficient in the sample with the lower concentration (e.g., in
this case the marginal value of education should be higher in the
South). In fact, exactly the opposite holds, and so the empirical result
is still a puzzle.

However (as argued by Heckman and Scheinkman), the
characteristics model does offer one explanation for this phenomenon. In fact,
the convexity of the price function in characteristics is precisely what
is needed. It implies that the marginal value (measured by price
change) of a characteristic is higher in those regions in which the
characteristic is more concentrated. Thus, if the market is reasonably
described by different geographical production functions and freely
mobile labor, the characteristics approach provides a solution for this
puzzle. (Note that, in sharp contrast to the explanation offered above,
the characteristics model requires that all factors be freely mobile.)


V. Concluding Remarks 

I close with a few brief remarks.

1. Adding other goods to the model is a straightforward extension.
For example, suppose that there are K other goods to be considered.
Let $x \in R^K_+$ denote consumption levels of these goods. Then
consumption sets of households are of the form $Z = R^K_+ \times M$, where M
is as before. If individuals have preferences on Z of the form

$$V^h(x, m) = U^h(x, c_1(m), \ldots, c_J(m)),$$

where $U^h$ is a standard utility function on $R^{K+J}_+$, the conclusions
of propositions 1 and 2 concerning the form of the restriction of p to T
are still valid.

In addition, if $U^h(x, c)$ can be written as
$U^h(x, c) = U^h_1(x) + U_2(c)$, where $U_2$ is homothetic and does not
depend on h, the conclusions of propositions 3 and 4 remain valid. To see
this, simply apply the argument of proposition 3 using the equilibrium
expenditure on goods in T in place of income.

Note that preferences of this form need not be homothetic on Z. In
particular, it is possible that prices are linear on T even though
expenditure on goods in T is not a constant percentage of income. Thus it is
quite possible that a linear regression of prices on characteristics
within an industry would perform quite well even though the income
elasticity of demand for products from that industry is not one (cf.
Muellbauer 1974).

2. Propositions 1-4 have obvious extensions to the case in which

$$V^h(m) = U^h\left( \int_T g_1(t)\,dm(t), \ldots, \int_T g_J(t)\,dm(t) \right),$$

where the $g_j$’s are nonnegative and continuous functions of t. For
example, p is convex in g and linear on “g-rays” in this case. In
particular, if the $g_j$ are convex functions of t (i.e., increasing
marginal value of characteristics), we can still conclude that p must be
convex in t. 6

3. The results reported here depend very heavily on the implicit
assumption of perfect divisibility. This can be readily seen in the
proof of proposition 1. If goods are not perfectly divisible, marginal
analysis (over quantities) is not in general valid and the argument
breaks down. If the collection of produced goods (S) has a nonempty
interior and we assume that households buy only one unit of the
good, marginal analysis on the levels of the characteristics can be
performed. This is the approach adopted by Rosen (1974).

We can relax Rosen's assumption that only one unit of one good in
T can be bought by allowing for the possibility of purchase of more
than one item while still retaining indivisibility. In this case, the
consumption set $M' \subset M$ is given by

$$M' = \{m \mid t \in \operatorname{supp} m \text{ implies } m(t) \text{ is an integer}\}.$$

Note that this restriction says that consumption must be in integer
quantities but not the characteristics of the goods themselves (i.e.,
the t’s).

This is the basis of the model analyzed in Mas-Colell (1975) and has
the appealing property that, although consumption must be in
integer quantities, this integer (e.g., number of television sets or
automobiles consumed) is endogenously determined. In this case, the
linearly combinable preferences given in (1) are still a possible
specification and analogues of propositions 1 and 2 still hold. It is easy
to see that in this case, $p(t_1 + t_2) \le p(t_1) + p(t_2)$ if $t_1$, $t_2$,
and $t_1 + t_2$ are in S. This does not imply that p is convex. For
example, a p that is convex across rays in S and concave along rays in S
will satisfy this restriction. In particular, the semilog price function
that is often used in hedonic regression (see the articles in Griliches
[1971]), $\ln p(t) = \alpha_1 t_1 + \ldots + \alpha_J t_J$, is of the
required form.

4. It has been pointed out that the state preference approach to 
modeling asset markets is a special case of the Lancasterian model. 
This can be readily seen by identifying each possible state with a 
characteristic in the discussion presented above. Hence, the results 
reported here apply to that model as well. Realistically, there are two 
reasons why our results probably do not have much bite in this case, 
however. First, if there is any independent variation in the profits of 

6 Note that the convexity of p as a function of t is quite consistent with the concavity of
V* as a function of m. This will be satisfied as long as U h is concave in its arguments. 
Note also that the concavity of V* does not (by itself) imply any restrictions on the form 
of the equilibrium price function. 
different firms, the number of states is of an order of magnitude
equal to the number of firms. In this case, neither convexity nor 
linearity of p implies anything about the relative prices of securities 
written on the different firms. Second, even in the case in which the 
number of securities is larger than the number of states, the possibility
of short sales implies that “bundling” phenomena are much less
important. As noted above, it follows that prices will be linearly de¬ 
composable except in the case in which some agent has zero consump¬ 
tion in some state. In addition, if the security payoff vectors “span” 
the space of state-contingent consumption possibilities, it follows that 
the hedonic prices are those of state-contingent consumption claims. 
If, for some reason, short sales are disallowed, our results imply that 
securities prices must be convex functions of their state-contingent 
payoffs. 

5. Because of the problem of indivisibilities discussed in point 3, it is 
difficult to give examples of consumer goods industries in which the 
results of Section III are directly applicable. Hence, most of the inter¬ 
esting applications of the results probably occur when the buyers are 
firms. 

The results are interesting for another reason, however. They show 
how interrelated the demands for the outputs of “monopolists” can 
be. Of course, the location model provides a “local” example of this 
phenomenon. What propositions 2 and 4 show is how these relation¬ 
ships can place qualitative “global” restrictions on demands when the 
assumptions of the characteristics model are met. 

6. It is natural to ask: Can proposition 3 be extended to cover any 
larger interesting class of cases? To gain more insight into this ques¬ 
tion, return to figure 1. As can be seen from the discussion in Section 
II, linearity will hold only if the marginal rates of substitution of the 
characteristics agree. This is equivalent to having the equilibrium allo¬ 
cation of characteristics lie on the “characteristics contract curve,” 
AEC. The only way to guarantee that this holds is to make sure that 
AEC is contained in the “true Edgeworth box,” ABCD (otherwise, 
whether or not prices are linear will depend on the initial distribution 
of resources). In general, the shape of ABCD depends critically on 
what endowments are. However, it follows that it is always convex and 
always contains the diagonal of the box. It is exactly equal to the 
diagonal if and only if all produced goods have the same relative 
proportions of characteristics. It is the entire box if and only if all 
produced goods are pure characteristics (i.e., there is no bundling). 

From this discussion, it is easy to see why proposition 3 works. 
Under the assumptions there it follows that the contract curve is the 
diagonal of the box. From the discussion above these combinations 
are always feasible. Thus all equilibria lie on the diagonal, and so
prices must be linear.

It is easy to see that there are no other restrictions on preferences 
that will work: if the contract curve is not the diagonal, construct 
endowments so that not all of AEC is feasible. Then there will be 
efficient allocations where supporting prices are nonlinear. Alterna¬ 
tively, one could ask for restrictions on endowments such that the 
desired linearity holds for all preferences. It is easy to see that this 
requires that endowments be in pure characteristics only. 

Beyond this, any generalization of proposition 3 would have to 
involve joint restrictions on endowments and preferences. 

Finally, note that these considerations imply that both linearity and 
nonlinearity of prices are “robust” cases. That is, if AEC is contained
in ABCD, making small changes in preferences or endowments will 
not change this, and hence prices will be linear. Similar reasoning 
applies if AEC lies entirely outside of ABCD. 
In this paper we contrast panics and information-based bank runs in
an effort to provide a robust and empirically plausible model of how 
bank runs are triggered. The model of information-based runs is 
characterized by two-sided asymmetric information: the bank cannot 
observe the true liquidity needs of the depositors while depositors 
are asymmetrically informed about bank asset quality. We also exam¬ 
ine the relative degrees of risk sharing provided by bank deposit 
contracts and traded equity contracts. We show that the choice of 
deposit or equity depends on the attributes of and information about 
the underlying investment returns. 


I. Introduction 

For more than two decades much research activity has been devoted
to understanding the microeconomic underpinnings of financial
intermediation and the specialness (if any) of commercial banks (e.g.,
Gurley and Shaw 1960; Tobin 1963; Stiglitz 1974; Fama 1980, 1985).
Since the (apparent) general equilibrium effects of financial crises,
particularly over 1930-33, have motivated much of the theoretical
and empirical banking research, the linkage between the foci of aca¬ 
demic research and governmental regulation has been particularly 
close. In the recent past, spurred on by the most significant changes in 
banking regulation since the 1930s, a stream of literature has begun 
to reexamine and extend the theses in prior research. Examples of 
such work include Kareken and Wallace (1978), Black, Miller, and 
Posner (1978), Bryant (1980), Bhattacharya (1982), Diamond and 
Dybvig (1983), Jacklin (1983), Chari and Jagannathan (1984), Dia¬ 
mond (1984), Smith (1984), and Bhattacharya and Gale (1985). 
Kareken (1986) attempts to summarize the historical and conceptual 
issues pertaining to governmental regulation of banking. 

Much of this work can be usefully classified (see also Nakamura 
1984) as focusing on one or more of the following issues. First,
technological economies of scale in monitoring and, more subtly,
diversification-based cost economies in information signaling have been
used to rationalize both the existence of financial intermediaries and 
(with strong assumptions) the fixed commitment nature of their con¬ 
tracts. 1 Second, it has been argued, informally by Tobin (1963) and 
formally by Diamond and Dybvig (1983) following on Bryant (1980), 
that intermediary contracts transform highly illiquid asset-payoff 
streams into more liquid liability payoffs. Third, the role and impact 
of governmental regulations in the areas of (a) deposit insurance,
(b) branching and other competitive restrictions, (c) deposit interest
rate controls, (d) asset portfolio choice (by banks), and (e) monetary
control have also received much scrutiny. 

Our work here focuses on the second issue of liquidity transforma¬ 
tion by banks as well as on the normative theory of welfare-optimal 
portfolio choices by intermediaries offering fixed-commitment contracts
in an effort to insure “preference shocks” as formalized in
Bryant (1980) and Diamond and Dybvig (1983). In contrast to Dia¬ 
mond and Dybvig, who use their framework to rationalize a role for 
deposit insurance even in the absence of risky bank assets, our analy¬ 
sis deals much more closely with the (optimal) choice of riskiness of 
bank assets. This difference in emphasis emanates from fundamental 
differences in our “world views” regarding the nature of bank runs 
that result in premature disinvestment of long-lived assets. 

Diamond and Dybvig view runs as “sunspot” phenomena that arise 
from completely unpredictable choices among Pareto-ordered Nash 
equilibria by agents (depositors). Our viewpoint, which is in greater 


1 Townsend (1979) was the first to rationalize fixed-commitment contracts on the 
basis of asymmetric knowledge of ex post payoffs. Diamond (1984) uses an extension of 
Townsend’s model to rationalize intermediation (multistage communication). 
accord with the less fully formalized work of Bryant as well as the
recent paper of Chari and Jagannathan (1984), emphasizes the role of 
interim private information about bank loan/asset payoffs on the part 
of depositors as the source of runs. The welfare implications of such 
behavior are shown to depend in important ways on (a) the relative 
response of deposit contracts versus traded contracts to such interim 
information and (b) the illiquidity of long-lived assets in a sense that is
not captured by Diamond and Dybvig; that is, in the short run, their 
payoffs are strictly less than those of short-lived assets. These welfare 
consequences are shown to have interesting implications for the 
choice of intermediary contract forms, specifically between nontraded 
deposits and traded equity, in that the optimal choice can depend on 
the underlying risk and informational attributes of assets. 

In the present paper our goal is to develop some of the basic “build¬ 
ing blocks” of a framework that is capable of a sustained and in¬ 
tegrated attack on the conceptual and regulatory issues relating to 
intermediation that were outlined above. In the process, we also 
establish deeper connections between the recent banking literature 
and (i) welfare analysis of informational market equilibria (Hirshleifer 
1971; Allen 1983) as well as (ii) the theory of intertemporal hedging 
in financial markets (Merton 1973; Breeden 1984). To keep the 
framework simplified, we deal with a model in which governmental 
deposit insurance is nonexistent. While we find our examples on the 
implications of asset attributes for intermediary contract choice to be 
of interest, we regard the methodological emphasis of this paper to be 
equally important. 

In Section II, our view and its implications for modeling and (regu¬ 
latory) policy choices are developed in greater detail. In Section III, 
we present an example of ex ante optimal contract choice in the 
presence of interim information that illustrates the dependence on 
asset attributes. Section IV, which concludes the paper, discusses the 
limitation of this example and suggests directions for further research 
(including those we are already pursuing). 

II. A Framework for Analyses of Banking 
Contracts 

A. Modeling Choices 

In common with Bryant (1980) and Diamond and Dybvig (1983), we
assume that all agents are born with T = 0 endowments of unity and
have (“derived”) utility of consumption at T = 1 and T = 2 only. Two
investment technologies are assumed: a short-lived one from T = 0 to
T = 1 and a long-lived one from T = 0 to T = 2. Agents’ consumption
preferences over T = {1,2} are assumed to be identically random
at T = 0, and this uncertainty is assumed to be resolved but privately
known by each agent at T = 1; that is, the random variables involved
are assumed to be independently and identically distributed across
agents.

Our further assumptions differ from those of Diamond and Dybvig 
in three important respects. First, they assume that at T = 1 agents 
have (realized) utility for either T = 1 or T = 2 consumption only 
(equivalently the two are perfect substitutes for a subset of agents). In
contrast, we assume that, conditional on all realizations in the support 
of the preference-shock random variable, agents’ consumption pref¬ 
erences are describable as smooth in the sense of Debreu (1972) with 
strictly positive utility for both (T = 1, 2) consumptions satisfying 
infinite marginal utility at zero consumption. The rationale is as fol¬ 
lows. As Jacklin (1983) has noted, the Diamond-Dybvig specification 
with no aggregate uncertainty about preferences (given the optimal
investment decisions at T = 0 described below) leads to the feature
that the ex ante optimal consumption allocation is implementable 
through trading. Shares of the investment portfolio (of short- and 
long-lived assets) can simply be traded at T = 1 as with a mutual fund, 
and thus the Fama (1980) critique of the specialness of banks is rele¬ 
vant. In contrast, with smooth preferences, nontraded demand de¬ 
posits may attain a welfare-superior allocation if both assets have cer¬ 
tain payoffs. 2 

Second, we assume that the return on the long-lived technology is a 
random variable, R, about which there is interim information at T = 1
that is asymmetrically observed by some depositors. This asymmetry
makes for the inflexibility in the terms of the menu of deposit con¬ 
tracts offered (in response to interim information). Both features are 
essential to obtain a welfare ordering over deposit versus equity con¬ 
tracts that is dependent on asset attributes, for example, riskiness of 
R. That is so because the Jacklin (1983) argument on the superiority 
of nontraded deposits easily extends to risky R with no interim infor¬ 
mation and because welfare-decreasing early withdrawals by agents 
with greater preference for T = 2 consumption arise in our model 

2 The essence of this result lies in the observations that the ex ante optimal (expected 
utility) allocation is not necessarily the competitive equilibrium from equal endowments 
and that trading imposes coalitional incentive-compatibility requirements, implying 
that (in a large economy) only the competitive equilibrium above is implementable. 
Bhattacharya and Gale (1985) have noted that, while the ex ante optimal investment 
allocation is not necessarily (traded liquidation) value maximizing, this does not cause a 
problem (with trading restrictions) if intermediaries are representative (see also Camp¬ 
bell 1984). Both Bhattacharya and Gale and Smith (1984) analyze the heterogeneous 
intermediary cases in different settings. 
The “distortionary” effects of interim information mentioned
above may be contrasted with those for market-based traded (mutual
fund) equity contracts (e.g., Allen 1983; Laffont 1985). In the present
context such contracts would involve the two types of trading shares,
each with the dividend pattern {L, R(1 - L)}, ex-dividend at $T = 1^{+}$.
As both Allen and Laffont note, except in very special circumstances 
(involving pretrade endowments being equilibrium ones for some s 
and sufficiently complete markets), the effect of interim information 
is to decrease the ex ante expected utility V** obtainable from equity 
contracts, particularly when production decisions are not (sufficiently) 
reversible/alterable at T = 1. Thus, at least in the context of our
modeling of real investments, equity (and deposit) contracts “suffer” 
from the presence of interim information (which is a priori asym¬ 
metrically known in the absence of trading). 

However, there is a key difference in the manner in which condi¬ 
tional expected utilities V(s) are affected with equity versus deposit 
contracts. For “plausible” scenarios, elaborated further in Section III 
below, the incentive compatibility of the (inflexible) deposit contract is 
affected for only “low” realizations of s, for example, a first-order 
stochastically worse conditional distribution of R(s). In contrast, at
least some agents suffer conditional expected utility losses (relative to
the case of no interim information), given the equity contract, over
the whole range of the distribution of the signal (s). Intuition obtained
from single-person decision theory (e.g., Rothschild and Stiglitz 1970) 
thus suggests that, in the presence of interim information, the ex¬ 
pected utility level V* obtained from deposit contracts may exceed the 
level V** from equity contracts for low-risk distributions of R, and 
vice versa. The degree to which this intuition holds, as well as the 
effects of other variables (e.g., the extent of information in s) on the 
V* versus V** comparison, is the focus of Sections III and IV below. 
The results depend on the degree to which “reverse hedging” (e.g., 
Breeden 1984) is important, that is, whether agents prefer to decrease 
their T = 1 consumption in response to bad information about R or
not. 

III. Attribute-dependent Contract Choice: 

An Example 

In this section, we present a parameterized example of the model 
described in Section II. For this example, we show that whether non- 
traded deposit contracts or traded equity contracts are preferred de¬ 
pends on the riskiness of the long-term investment technology as well 
as the structure of information regarding the asset’s return. 
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Two alternative economies are compared. They are identical ex¬ 
cept that in one risk sharing is accomplished with demand deposits 
while in the other it is accomplished with equity shares. In both econo¬ 
mies the following specifications of the model of Section II are as¬ 
sumed. 

Preferences. In equation (1) define $U(c) = \sqrt{c}$, $\rho_1 = \rho < 1$, and $\rho_2 = 1$.

Two-period investment technology. Define R, the total return from
the two-period investment technology, as being $R_l$ with prior (at T = 0)
probability $\theta$ and $R_h$ with probability $1 - \theta$, where $R_h > 1$, $0 < R_l < R_h$.
For simplicity, we assume that the two-period technology cannot
be liquidated early (or, equivalently, only at a total loss). The private 
information banks possess about their loan portfolio may lead to 
inefficient liquidation of loans due to Akerlof’s (1970) “lemons” prob¬ 
lem. The lack of secondary markets for many types of loans also 
follows from this information asymmetry. Nonetheless, the assump¬ 
tion of total illiquidity is much stronger than necessary. We need only 
assume that early liquidation of the two-period investment is never 
optimal. 

Information. At T = 1 a proportion $\alpha$ of type 2 individuals observe
a signal s, which they use to update their prior assessments on R. 7 The
signal is described by the distribution of posterior beliefs to which it
could lead. Given that $R_l$ and $R_h$ are fixed, $\hat\theta$ describes the posterior
beliefs about R. The posterior beliefs are always consistent with the
priors. We have $\theta = \sum_s \mathrm{prob}(s)\,\hat\theta_s$, where $\hat\theta_s$ is the value of $\hat\theta$ given that s
is observed.

A. Demand Deposit Economy 

First, formally define “demand deposit” as a contract that requires an
initial investment at T = 0 in exchange for the right to withdraw per
unit of investment (at the discretion of the depositor and conditional
on the bank’s solvency) either $x_1$ in period 1 and $x_2$ in period 2 or $y_1$ in
period 1 and $y_2$ in period 2. As part of the definition, assume that
trading in demand deposits is prohibited. Jacklin (1983) demonstrates
that this contract optimally combines the two types of deposits that 


7 Although not modeled here, this assumption is motivated by the fact that if infor¬ 
mation were costly, then the “late diers" (i.e., those who prefer second-period consump¬ 
tion relatively more) would be more likely to purchase information. Furthermore, if 
depositors were of different sizes (again not modeled here), then larger depositors 
would be more likely to purchase the costly signal. Thus we capture these unmodeled 
aspects of the problem by assuming that only a fraction of the type 2 individuals 
observe the signal. 
individuals would hold: a two-period savings deposit and a more typi¬
cal demand deposit. 

The uncertain second-period return reflects the fact that, having 
invested in a risky technology, the bank may not be able to make its 
promised second-period payments in full. One way to think of this is 
that the bank promises an amount it will be able to pay only if $R = R_h$.
If $R = R_l$, the bank is considered insolvent and depositors get $R_l/R_h$ of
their promised payments. 8 There is no deposit insurance in this
model. 9 

We now explore the characteristics of demand deposits in the 
specified economy. First, we solve the constrained social optimiza¬ 
tion—(2), (3a), (3b), and (4)—to compute the terms of the demand 
deposit contract given the commonly held prior belief about the
distribution of R and ignoring the impact of any interim private
information depositors may receive. Then we analyze the effect of interim
information on the incentive compatibility of the demand deposit 
contract. 

The principal result of this subsection is the proposition that
characterizes the relation between the run threshold level of the posterior
estimate of $\theta$, call it $\bar\theta$, and the variance of the production process
return, R. The run threshold level indicates the value of $\hat\theta$ above which
the informed type 2 individuals prefer the type 1 withdrawal. The
threshold level, $\bar\theta$, is shown to be inversely related to the variance of
R.

This analysis examines a deposit contract designed under the as¬ 
sumption that there is no interim information about asset quality and 
then demonstrates the impact of introducing such information. Obvi¬ 
ously, the presence of such information should affect the deposit 
contract terms. In subsection C, the numerical examples incorporate 
the presence of the interim information under optimization to deter¬ 
mine the deposit contract terms. Leaving the interim information out 
at this point greatly simplifies the analysis. Moreover, there are no 
qualitative differences in the results based on these analytic results 
when compared with the (slightly) more general numerical analysis. 

Given the assumed preference structure, the unconstrained social 
optimum is never incentive compatible. It has the characteristic that 
both types receive the same allocation in the first period, but the type 


8 That is, if $R = R_h$, then the second-period payments are $x_2$ and $y_2$. However, if $R = R_l$, then they are $x_2 R_l/R_h$
and $y_2 R_l/R_h$.

9 This simplification is “justified” because of our focus on a model of the banking 
system as a whole, with representative banks. Thus the interim information in our 
model pertains to aggregate shocks; a much more complex (intergenerational) model is
needed to consider deposit insurance in such a context. 
2 individuals (who do not discount second-period consumption) re¬
ceive a strictly larger allocation in the second period. In solving the 
constrained optimization, it is necessary only to consider the type 1 
incentive-compatibility constraint because it can be shown that the 
type 2 constraint is never binding for the solution to the singly con¬ 
strained problem. 

Given the shared prior on R, the demand deposit terms are computed
by solving for the constrained optimum. When the problem is
solved, the bank sets its contract terms so that $x_1 = c_{11}$, $x_2 = c_{21}$,
$y_1 = c_{12}$, and $y_2 = c_{22}$. If $R = R_h$, the bank pays its promised second-period
return. However, if $R = R_l$, the bank is considered insolvent in the
second period and pays $R_l/R_h$ of its promised payments. This is a
plausible description of the actual proceedings when a bank becomes
insolvent, and it is shown in the Appendix that this policy is socially
optimal given the assumed preference structure. That is, the following
modified optimization is equivalent to the constrained social
optimization problem:

$$\max\; t\left(\sqrt{c_{11}} + \rho A\sqrt{c_{21}}\right) + (1 - t)\left(\sqrt{c_{12}} + A\sqrt{c_{22}}\right) \tag{5}$$

subject to

$$t\left(c_{11} + \frac{c_{21}}{R_h}\right) + (1 - t)\left(c_{12} + \frac{c_{22}}{R_h}\right) = 1 \tag{6}$$

and

$$\sqrt{c_{11}} + \rho A\sqrt{c_{21}} - \sqrt{c_{12}} - \rho A\sqrt{c_{22}} \ge 0, \tag{7}$$

where $A = 1 - \theta + \theta(R_l/R_h)^{1/2}$. Notice that equation (7) is the
incentive-compatibility constraint that guarantees that the type 1 depositors
will prefer the type 1 withdrawal stream $(c_{11}, c_{21})$ to the type 2
withdrawal stream $(c_{12}, c_{22})$.

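The first-order conditions for the optimization above are

$$\left(1 + \frac{z_2}{t}\right)\frac{1}{\sqrt{c_{11}}} = z_1, \tag{8a}$$

$$\left(1 - \frac{z_2}{1 - t}\right)\frac{1}{\sqrt{c_{12}}} = z_1, \tag{8b}$$

$$\rho A\left(1 + \frac{z_2}{t}\right)\frac{1}{\sqrt{c_{21}}} = \frac{z_1}{R_h}, \tag{8c}$$

$$A\left(1 - \frac{\rho z_2}{1 - t}\right)\frac{1}{\sqrt{c_{22}}} = \frac{z_1}{R_h}, \tag{8d}$$

$$t\left(c_{11} + \frac{c_{21}}{R_h}\right) + (1 - t)\left(c_{12} + \frac{c_{22}}{R_h}\right) - 1 = 0, \tag{8e}$$

$$\sqrt{c_{11}} + \rho A\sqrt{c_{21}} = \sqrt{c_{12}} + \rho A\sqrt{c_{22}}, \tag{8f}$$

where $z_1 > 0$ and $z_2 > 0$ are the Lagrange multipliers associated with
constraints (6) and (7).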

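Solving (8a)-(8f) for $c_{11}$, $c_{12}$, $c_{21}$, and $c_{22}$, we get

$$c_{11} = \frac{1}{t(1 + \rho^2 A^2 R_h) + (1 - t)(K_1^2 + A^2 R_h K_2^2)}, \tag{9a}$$

$$c_{12} = K_1^2 c_{11}, \tag{9b}$$

$$c_{21} = (A\rho R_h)^2 c_{11}, \tag{9c}$$

$$c_{22} = (A R_h K_2)^2 c_{11}, \tag{9d}$$

where

$$K_1 = \frac{1 + \rho A^2 R_h[\rho - t(1 - \rho)]}{1 + \rho A^2 R_h[1 - t(1 - \rho)]}, \qquad K_2 = \frac{1 + \rho^2 A^2 R_h[1 - t(1 - \rho)]}{1 + \rho A^2 R_h[1 - t(1 - \rho)]}.$$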
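As a numerical sanity check on the closed-form solution, the following sketch evaluates (9a)-(9d) and confirms that the resulting allocation exhausts the resource constraint (6) and makes the incentive constraint (7) bind. It is an illustration, not the authors' procedure: the function names are hypothetical, the formulas for $K_1$ and $K_2$ are as read from the text, and the parameter values are assumptions borrowed from the numerical examples ($\rho = 0.8$, $t = 0.5$, $\theta = 0.2$) with an assumed return pair $R_h = 1.8125$, $R_l = 0.25$.

```python
import math


def deposit_allocation(rho, t, theta, R_h, R_l):
    """Evaluate the closed-form allocation (9a)-(9d) for square-root utility.

    rho   : type 1 discount factor on second-period consumption (rho < 1)
    t     : proportion of type 1 agents
    theta : prior probability that R = R_l
    """
    A = 1 - theta + theta * math.sqrt(R_l / R_h)
    s = 1 - t * (1 - rho)                    # recurring term 1 - t(1 - rho)
    den = 1 + rho * A**2 * R_h * s           # common denominator of K_1 and K_2
    K1 = (1 + rho * A**2 * R_h * (rho - t * (1 - rho))) / den
    K2 = (1 + rho**2 * A**2 * R_h * s) / den
    c11 = 1 / (t * (1 + rho**2 * A**2 * R_h)
               + (1 - t) * (K1**2 + A**2 * R_h * K2**2))
    c12 = K1**2 * c11
    c21 = (A * rho * R_h) ** 2 * c11
    c22 = (A * R_h * K2) ** 2 * c11
    return A, c11, c21, c12, c22


def resource_use(t, R_h, c11, c21, c12, c22):
    """Left-hand side of the resource constraint (6)."""
    return t * (c11 + c21 / R_h) + (1 - t) * (c12 + c22 / R_h)


def ic_slack(rho, A, c11, c21, c12, c22):
    """Left-hand side of the type 1 incentive constraint (7)."""
    return (math.sqrt(c11) + rho * A * math.sqrt(c21)
            - math.sqrt(c12) - rho * A * math.sqrt(c22))
```

Calling `deposit_allocation(0.8, 0.5, 0.2, 1.8125, 0.25)` and passing the result to the two helpers should return a resource use of one and an incentive slack of zero up to rounding error, with the type 2 stream tilted toward the second period.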

Some type 2 individuals receive information that causes them to
update their probability assessment of $R = R_l$ from $\theta$ to $\hat\theta$. 10 Given this
revised probability assessment, the informed type 2 individuals may
prefer the type 1 withdrawal over the type 2 withdrawal. If this is the
case, the bank will not have enough funds to meet the demand for
type 1 withdrawals at T = 1. Our assumption is that the bank allows
individuals to make type 1 withdrawals until a proportion t have done
so. Beyond this point only type 2 withdrawals are allowed. The question
we now investigate is for what values of $\hat\theta$ these type 2 individuals
prefer to make type 1 withdrawals and consequently upset the bank’s
allocation scheme. 11 We show that the run threshold level, $\bar\theta$, over
which type 2 individuals prefer the type 1 allocation is inversely
related to the variance of return.

Define

$$A' = 1 - \hat\theta + \hat\theta\,(R_l/R_h)^{1/2}.$$

First, we need to know for what value of A'

$$\hat{E}[V(c_{12}, c_{22}, 2)] < \hat{E}[V(c_{11}, c_{21}, 2)],$$

where $\hat{E}$ indicates expectation using the revised probability
assessment; that is, for what values of A'

$$\sqrt{c_{12}} + A'\sqrt{c_{22}} < \sqrt{c_{11}} + A'\sqrt{c_{21}}.$$

10 See n. 7 for the motivation behind this assumption.

11 Notice that some type 2’s making type 1 withdrawals will make everyone worse off
in an ex ante expected utility sense since the deposit contract is the result of constrained
ex ante expected utility maximization and since ex ante depositors know neither their
type nor whether they will have any private information at T = 1.


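This implies that

$$A' < \frac{\sqrt{c_{11}} - \sqrt{c_{12}}}{\sqrt{c_{22}} - \sqrt{c_{21}}}.$$

Using (9b)-(9d) yields

$$A' < \frac{1 - K_1}{A R_h (K_2 - \rho)}.$$

Substituting for $K_1$ and $K_2$ and simplifying yields

$$A' < \frac{\rho A^2 R_h (1 - \rho)}{A R_h (1 - \rho)} = \rho A.$$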
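The simplification to $\rho A$ can be spelled out in one intermediate step. The sketch below assumes that $K_1$ and $K_2$ share the denominator $1 + \rho A^2 R_h[1 - t(1 - \rho)]$, with numerators $1 + \rho A^2 R_h[\rho - t(1 - \rho)]$ and $1 + \rho^2 A^2 R_h[1 - t(1 - \rho)]$ respectively, which is how the expressions in the text read; the shorthand $D$ is introduced here for convenience and is not the paper's notation.

```latex
\begin{aligned}
D &\equiv 1 + \rho A^2 R_h\,[1 - t(1-\rho)],\\
1 - K_1 &= \frac{\rho A^2 R_h\bigl\{[1 - t(1-\rho)] - [\rho - t(1-\rho)]\bigr\}}{D}
         = \frac{\rho A^2 R_h (1-\rho)}{D},\\
K_2 - \rho &= \frac{1 + \rho^2 A^2 R_h[1 - t(1-\rho)] - \rho - \rho^2 A^2 R_h[1 - t(1-\rho)]}{D}
         = \frac{1-\rho}{D},\\
\frac{1 - K_1}{A R_h (K_2 - \rho)} &= \frac{\rho A^2 R_h (1-\rho)}{A R_h (1-\rho)} = \rho A.
\end{aligned}
```

The common denominator $D$ cancels, so the bound on $A'$ does not depend on the proportion $t$ of type 1 agents.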


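Substituting for A and A' and simplifying yields

$$\hat\theta > \rho\theta + (1 - \rho)\left(\frac{\sqrt{R_h}}{\sqrt{R_h} - \sqrt{R_l}}\right).$$

Define

$$\bar\theta = \rho\theta + (1 - \rho)\left(\frac{\sqrt{R_h}}{\sqrt{R_h} - \sqrt{R_l}}\right).$$

Then type 2 individuals who receive information that leads them to
update their assessment of the probability that $R = R_l$ from $\theta$ to $\hat\theta > \bar\theta$
prefer the allocation $(c_{11}, c_{21})$ to $(c_{12}, c_{22})$.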

Now we show that $\bar\theta$, the threshold probability beyond which the
type 2’s allocation is not incentive compatible, is a decreasing function
of the dispersion between $R_h$ and $R_l$. We want to show the impact on $\bar\theta$ of
changing the dispersion between $R_h$ and $R_l$, holding the mean return
and information quality constant. Therefore, let $R_h = \bar{R} + [\Delta/(1 - \theta)]$
and $R_l = \bar{R} - (\Delta/\theta)$. Increasing $\Delta$ increases the dispersion
between $R_h$ and $R_l$, with the mean of the prior distribution of R
held constant and equal to $\bar{R}$.

Now compute the relationship between $\Delta$ and $\bar\theta$. Differentiating $\bar\theta$
with respect to $\Delta$ yields

$$\frac{d\bar\theta}{d\Delta} = \frac{(1 - \rho)\left\{-\dfrac{\sqrt{R_l}}{2(1 - \theta)\sqrt{R_h}} - \dfrac{\sqrt{R_h}}{2\theta\sqrt{R_l}}\right\}}{\left(\sqrt{R_h} - \sqrt{R_l}\right)^2},$$

since $dR_h/d\Delta = 1/(1 - \theta)$ and $dR_l/d\Delta = -(1/\theta)$. Clearly the
denominator is positive. Therefore, the sign of the numerator is the sign
of the entire expression. The numerator is clearly negative since both
terms inside the braces are negative. Thus $d\bar\theta/d\Delta < 0$. That is, the
threshold level at which type 2 individuals prefer the type 1
withdrawal stream decreases as the dispersion of $R_l$ and $R_h$ increases. Thus
we have the following proposition.

Proposition. In the case of square root utility and a two-point
support return distribution, with the mean return and information
structure held constant, the run threshold level, $\bar\theta$, is inversely related
to the variance of return. 12
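The proposition can be illustrated numerically. The sketch below assumes the threshold takes the form obtained by solving $A' < \rho A$ for the posterior, namely $\rho\theta + (1 - \rho)\sqrt{R_h}/(\sqrt{R_h} - \sqrt{R_l})$, together with the mean-preserving spread $R_h = \bar{R} + \Delta/(1 - \theta)$, $R_l = \bar{R} - \Delta/\theta$. The function name `run_threshold`, the grid of $\Delta$ values, and the parameter values ($\rho = 0.8$, $\theta = 0.2$, $\bar{R} = 1.5$, taken from the numerical examples) are assumptions made for illustration.

```python
import math


def run_threshold(rho, theta, R_bar, delta):
    """Posterior probability of the low return above which informed
    type 2 depositors prefer the type 1 withdrawal (assumed closed form)."""
    if delta <= 0:
        raise ValueError("delta must be positive so that R_h > R_l")
    R_h = R_bar + delta / (1 - theta)   # high return after the spread
    R_l = R_bar - delta / theta         # low return after the spread
    if R_l <= 0:
        raise ValueError("delta too large: R_l must stay positive")
    root_h, root_l = math.sqrt(R_h), math.sqrt(R_l)
    return rho * theta + (1 - rho) * root_h / (root_h - root_l)


# Evaluate the threshold on an illustrative grid of dispersion values.
deltas = [0.05, 0.10, 0.15, 0.20, 0.25]
thresholds = [run_threshold(0.8, 0.2, 1.5, d) for d in deltas]
```

On this grid the threshold falls monotonically as $\Delta$ rises. At small dispersion it exceeds one, so no posterior belief can trigger a run; at $\Delta = 0.25$ it is roughly 0.48, so a posterior of 0.6 would.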

B. Equity Economy 

In this subsection we derive the equations that characterize the
competitive equilibrium in an equity economy for the specified case.
Equity pays a total dividend of a at T = 1 and a liquidating dividend
of R(1 - a) at T = 2, where a is chosen to maximize ex ante expected
utility in this economy. A market for ex-dividend shares opens at
T = 1. In this economy the price will be fully revealing and thus will
reflect all the information available about the quality of the underlying
asset. Given the assumed structure, the type 1’s problem at T = 1 is

$$\max_{c_{11},\, c_{21}}\; \sqrt{c_{11}} + A\rho\sqrt{c_{21}}$$

subject to

$$a + R_h(1 - a)P_\theta = c_{11} + P_\theta c_{21},$$

where $A = (1 - \theta) + \theta\sqrt{R_l/R_h}$, a is the amount of the first-period
dividend, and $P_\theta$ is the price of $1/[R_h(1 - a)]$ ex-dividend shares (i.e.,
a claim that pays one unit if $R = R_h$ and $R_l/R_h$ units if $R = R_l$). The
subscript $\theta$ on P indicates the dependence of P on information about
$\theta$. The first-order conditions to this maximization are

$$\frac{\rho A\sqrt{c_{11}}}{\sqrt{c_{21}}} = P_\theta \tag{10}$$

and

$$a + R_h(1 - a)P_\theta = c_{11} + P_\theta c_{21}. \tag{11}$$


12 Although we are able to obtain closed-form solutions only for the square root
utility case, we believe that this result holds for all constant relative risk aversion utility
cases with relative risk aversion less than one, so that bad news about asset returns leads
to a preference for the type 1 consumption pattern. The case of relative risk aversion
greater than unity is discussed in Sec. IV below.
The equivalent first-order conditions for the type 2’s at T = 1 are

$$\frac{A\sqrt{c_{12}}}{\sqrt{c_{22}}} = P_\theta \tag{12}$$

and

$$a + R_h(1 - a)P_\theta = c_{12} + P_\theta c_{22}. \tag{13}$$

Market clearing requires that

$$a = t c_{11} + (1 - t)c_{12} \tag{14}$$

and

$$(1 - a)R_h = t c_{21} + (1 - t)c_{22}. \tag{15}$$


Equations (10)-(15) define a competitive equilibrium in this economy.
Equation (10) implies that

$$c_{21} = \frac{\rho^2 A^2 c_{11}}{P_\theta^2}. \tag{16}$$

Substituting (16) into (11) and simplifying yields

$$c_{11} = \frac{P_\theta a + R_h(1 - a)P_\theta^2}{P_\theta + \rho^2 A^2}. \tag{17}$$

Equation (12) implies that

$$c_{22} = \frac{A^2 c_{12}}{P_\theta^2}. \tag{18}$$

Substituting (18) into (13) yields

$$c_{12} = \frac{P_\theta a + R_h(1 - a)P_\theta^2}{P_\theta + A^2}. \tag{19}$$

Substituting for $c_{11}$ and $c_{12}$ in equation (14) using (17) and (19) and
solving for $P_\theta$ yields the following cubic expression for $P_\theta$:

$$R_h(1 - a)P_\theta^3 + [t + (1 - t)\rho^2]A^2 R_h(1 - a)P_\theta^2 + aA^2[t(1 - \rho^2) - 1]P_\theta - a\rho^2 A^4 = 0. \tag{20}$$



The value of $P_\theta$ is found by solving for the positive root of equation
(20). This value is then substituted into equations (16)-(19) to find
the competitive allocation given the information about $\theta$ available at
T = 1. This allocation is then used to compute the ex ante expected
utility in this economy given the information structure.
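The procedure just described can be sketched in a few lines. The code below finds the positive root of the cubic (20) by bisection and then recovers the allocation from (16)-(19). It is a minimal illustration under stated assumptions: the function name `equity_equilibrium` is hypothetical, the dividend `a` is supplied by the caller rather than chosen optimally as in the paper, `theta_hat` stands for whatever assessment of the probability that $R = R_l$ is held at T = 1, and bisection is a convenience choice (the cubic's coefficients change sign only once, so it has a single positive root).

```python
import math


def equity_equilibrium(rho, t, theta_hat, R_h, R_l, a):
    """Solve cubic (20) for the share price and return (P, allocation).

    a must lie strictly between 0 and 1 so the cubic has a positive root.
    """
    A = (1 - theta_hat) + theta_hat * math.sqrt(R_l / R_h)
    k3 = R_h * (1 - a)
    k2 = (t + (1 - t) * rho**2) * A**2 * R_h * (1 - a)
    k1 = a * A**2 * (t * (1 - rho**2) - 1)
    k0 = -a * rho**2 * A**4

    def cubic(P):
        return ((k3 * P + k2) * P + k1) * P + k0

    lo, hi = 0.0, 1.0
    while cubic(hi) < 0:            # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):            # bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if cubic(mid) < 0:
            lo = mid
        else:
            hi = mid
    P = 0.5 * (lo + hi)

    wealth = a * P + R_h * (1 - a) * P**2     # numerator of (17) and (19)
    c11 = wealth / (P + rho**2 * A**2)        # equation (17)
    c21 = rho**2 * A**2 * c11 / P**2          # equation (16)
    c12 = wealth / (P + A**2)                 # equation (19)
    c22 = A**2 * c12 / P**2                   # equation (18)
    return P, (c11, c21, c12, c22)
```

A useful check on any solution is that both market-clearing conditions (14) and (15) hold; the second follows from the first and the two budget constraints, so it serves as an independent test of the algebra.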
C. Welfare Comparisons: Numerical Examples

In the previous two subsections we characterized the allocations 
achieved in deposit and equity economies. In this subsection we illus¬ 
trate the relation between the choice of contract form and each of the 
following: (1) variance of the asset return, (2) availability of
information, and (3) quality of information.

In these examples, the allocations in the equity economies are de¬ 
termined as described in the previous subsection. That is, the alloca¬ 
tions are the result of choosing the dividend that maximizes ex ante 
expected utility given that the ex post allocation for each information 
state is the result of a fully revealing competitive equilibrium. 

In the deposit economy, the deposit contract terms are also chosen 
to maximize ex ante expected utility when the impact of the interim 
information available at T = 1 is taken into consideration. In the 
event that this interim information leads to a run, it is assumed that all 
the type 1 but only the informed type 2 depositors attempt to with¬ 
draw early (make the type 1 withdrawal). Since this demand cannot be 
met, it is assumed that the available funds are allocated randomly 
among the type 1 and informed type 2 depositors. That is, it is as¬ 
sumed that each depositor attempting to make a type 1 withdrawal 
arrives in line randomly and the depositors are then treated on a first- 
come-first-served basis. In the second period, if the bad state occurs,
depositors are assumed to receive $R_l/R_h$ of their promised payment.13
Let ρ = 0.8, t = 0.5, θ = 0.2, and R = 1.5. Given these parameters,
figures 1-4 display the difference between the expected utility using
demand deposits and the expected utility using equity contracts as a
function of A, which measures the dispersion of the asset return. Each
figure presents three graphs (except fig. 4, in which the three graphs
coincide everywhere) that represent different levels of information
availability (i.e., a = 0.25, 0.50, 1.0). The figures differ by the
information structure in the alternative economies. In figure 1 the
information structure is

with probability .8, $s = s_1 \Rightarrow \theta = .1$,

with probability .2, $s = s_2 \Rightarrow \theta = .6$.

In figure 2 the information structure is

with probability 2/3, $s = s_1 \Rightarrow \theta = .1$,

with probability 1/3, $s = s_2 \Rightarrow \theta = .4$.

13 We retain this assumption in the interest of realism, even though ex ante utility
maximization would lead to type 1 and type 2 withdrawers' being treated differently in 
the bad state when interim information is explicitly considered in the optimization 
problem. 
Fig. 2. Expected utility using demand deposits less expected utility using equity as a
function of A, the measure of dispersion in the return R, assuming the second
information structure.

Fig. 3. Expected utility using demand deposits less expected utility using equity as a
function of A, the measure of dispersion in the return R, with perfect information.

Fig. 4. Expected utility using demand deposits less expected utility using equity as a
function of A, the measure of dispersion in the return R, with no information.

In figure 3 there is perfect information. That is,

with probability .8, $s = s_1 \Rightarrow \theta = 0$,

with probability .2, $s = s_2 \Rightarrow \theta = 1.0$.

Finally, in figure 4 there is no information. 
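
One property of these information structures is easy to verify: in each case the signal-weighted average of the revised values of θ equals the prior θ = 0.2, so the signals change what depositors know at T = 1 without changing the ex ante probability of the bad state. The short Python sketch below checks this with exact rational arithmetic. The dictionary layout and function names are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction

PRIOR_THETA = Fraction(1, 5)  # theta = 0.2, as in the parameter list

# Each structure is a list of (signal probability, revised theta) pairs.
STRUCTURES = {
    "figure 1": [(Fraction(4, 5), Fraction(1, 10)), (Fraction(1, 5), Fraction(3, 5))],
    "figure 2": [(Fraction(2, 3), Fraction(1, 10)), (Fraction(1, 3), Fraction(2, 5))],
    "figure 3": [(Fraction(4, 5), Fraction(0)), (Fraction(1, 5), Fraction(1))],
}


def expected_posterior(structure):
    """Signal-weighted average of the revised probabilities of the bad state."""
    return sum(prob * theta for prob, theta in structure)


def is_consistent(structure, prior=PRIOR_THETA):
    """True if signal probabilities sum to one and average back to the prior."""
    total_prob = sum(prob for prob, _ in structure)
    return total_prob == 1 and expected_posterior(structure) == prior


for name, structure in STRUCTURES.items():
    print(name, expected_posterior(structure), is_consistent(structure))
```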

In each of the first three figures for low levels of A the difference is 
positive (i.e., deposit contracts are preferred). 14 By our earlier propo¬ 
sition, we know that as A increases, the run threshold level decreases. 
For large enough values of A, s 2 (the bad signal) leads the informed 
type 2 individuals to attempt to make type 1 withdrawals. This occur¬ 
rence is depicted as a downward jump in each of the graphs. The 
figures also clearly show that the greater is the percentage of in¬ 
formed type 2 individuals, the worse deposit contracts perform rela¬ 
tive to equity contracts. Comparing figures 1, 2, and 3, one sees that 
the size of the downward jump and the level of A at which it occurs 
are both functions of the information structure. The higher the
probability of the bad signal, the larger the jump size is, whereas the
higher the value of θ given the bad signal, the lower is the level of A at
which the jump occurs.

Another striking characteristic of figures 1-3 is that after the 
downward jump the graphs are all increasing in A. This reflects the 
fact that the informed type 2 individuals begin making type 1 with¬ 
drawals as soon as it is marginally better for them to do so. As A 
increases beyond that point, the informed type 2 individuals who 
successfully make type 1 withdrawals benefit relatively more from 
making the type 1 withdrawal than the type 1 individuals who are 
forced to make type 2 withdrawals are harmed. Thus the expected 
utility from using deposit contracts increases relative to the expected 
utility from using equity contracts as A increases beyond the level at 
which runs are introduced. Notice that, for all three levels of a in 
figure 3 and for a = 0.25 in figure 1, for very large values of A, 
deposit contracts may again be the preferred instrument. 

Figure 4 represents the no-information case. In this case there are 
never runs with demand deposits. Therefore, they dominate equity 
contracts for all levels of A. 


IV. Discussion 

In the last section, we provided an example demonstrating (1) that 
nontraded deposit contracts may or may not be preferred to traded 


14 See n. 2 and Jacklin (1985) for a more detailed explanation of this result. Jacklin
(1985) also contains a more detailed description of the construction of these examples.

equity-type contracts and (2) that this preference depends on the
riskiness of the underlying assets held by the intermediary as well as 
the nature and availability of information about the underlying assets. 
The observation from these examples that deposit contracts tend to 
be better for financing low-risk assets is consistent with several casual 
observations about the banking industry. On the liability side of the 
balance sheet, banks are by far the most levered firms in the economy. 
This is true even in countries without deposit insurance. This degree 
of leverage would be very difficult to maintain if banks held highly 
volatile assets. Moreover, bank assets are composed primarily of 
short- and medium-term debt instruments, which have cash flows and 
values that are much less volatile than equities. This is true (see
Langohr and Santomero 1985) even in European countries, such as
Germany, where banks have few constraints as to their asset holdings.

In this section, we discuss the example of the last section further 
and in doing so highlight the differences between information-based 
runs we have modeled and pure panic runs modeled by Diamond and 
Dybvig (1983). Our hope is that this discussion will clarify the relevant
issues as well as motivate additional research in this area (including 
our own). 

In Diamond and Dybvig’s model, runs occur because depositors 
collectively choose a Pareto-dominated equilibrium. Had depositors 
known ex ante that a bank run was going to occur, they would not 
have deposited their money in the first place. Thus the question of 
what triggers the run is not addressed. On the other hand, we address 
this question by explicitly modeling interim information about the 
bank’s investment in risky long-lived assets. Moreover, in our model 
when a run takes place it is the only equilibrium. Furthermore, as we 
show below, in our setting bank runs are not a problem if either of 
two of Diamond and Dybvig’s assumptions is introduced. 

In our example, we assume additive square root utility for con¬ 
sumption, which implies that in any period the agents have relative 
risk aversion less than one. We also assume that the long-lived asset is 
totally illiquid. One interpretation of this assumption is that the asset 
represents long-term loans that cannot be “called in” early and for 
which no secondary market exists, perhaps because of a “lemons” 
problem as in Akerlof (1970). Admittedly our liquidity assumption is 
extreme, but as we noted, it is more extreme than necessary. 

On the other hand, Diamond and Dybvig assume that the relative
risk aversion of their “corner” preferences is greater than one.
Moreover, rather than assuming the existence of a long-lived and a
short-lived asset, they assume that only one long-lived asset exists.
This asset yields a total return of R (> 1) over two periods or can be
liquidated at the end of one period with the investor recovering his
initial investment.
Now we analyze the impact of changing the assumptions made in
our model to those made by Diamond and Dybvig. Consider the 
assumptions regarding the available investment technologies. If we 
replace our two investment technologies with the single technology 
assumed by Diamond and Dybvig, runs no longer present a problem. 
This is true because, with relative risk aversion less than one, the 
optimal first-period rate of return is negative. Thus an excess of type 
1 withdrawals does not jeopardize the ability of the bank to make 
payments in the second period.15 Diamond and Dybvig point this out
in their analysis in motivating their assumption that relative risk aver¬ 
sion is greater than one. Furthermore, this result is consistent with the 
hedging/reverse hedging results of Breeden (1984). Thus in this case 
pure panic runs do not exist since no one is worried about the bank’s 
ability to pay in the second period. On the other hand, information- 
based runs do exist in that everyone may choose to make type 1 
withdrawals. However, these runs do not present a problem since the 
long-lived asset is liquid enough to meet all demands. In fact these 
runs are beneficial in that they occur when the prospects for the long- 
lived asset are poor and lead the bank to liquidate a greater propor¬ 
tion of these assets than originally planned. 

The discussion above demonstrates that bank runs do not present a 
problem when long-lived assets are sufficiently liquid and depositors 
are not very risk averse. Now we consider the case in which depositors 
are more risk averse; that is, they have a relative risk aversion 
coefficient greater than one. This is the case in which Diamond and 
Dybvig demonstrate that pure panic runs can occur as “sunspot” 
equilibria. As we mentioned earlier, they leave open the question of 
what “triggers” these runs. Presumably, as we showed in our example, 
bad information about the bank’s assets does so. In this case, however, 
this presumption is not necessarily correct. As Breeden (1984) dem¬ 
onstrates for the class of utility functions with constant relative risk 
aversion, if people are sufficiently risk averse, they choose to reverse 
hedge. That is, given bad news, they choose to consume less now in 
order to invest more in the uncertain future. In our model this means 
that if depositors have a relative risk aversion coefficient greater than 
one, then bad news about the bank's assets may not lead them to run 
on the bank. This can be illustrated in the context of our model for 
the same class of utility functions examined by Breeden as follows. 

Consider the case in which U(c) in equation (1) represents
preferences exhibiting constant relative risk aversion with a relative risk
aversion parameter greater than one. That is, $U(c) = c^{1-\gamma}/(1 - \gamma)$


15 This can be demonstrated formally in our example by showing that if a proportion
of depositors greater than t make type 1 withdrawals, then the budget constraint is not 
binding. This exercise is quite tedious and is not presented here. 
with $\gamma > 1$. It follows directly from the ex ante incentive and
budget-constrained expected utility maximization that $c_{11} > c_{12}$ and
$c_{21} < c_{22}$. Type 2 depositors will attempt to withdraw early if their
conditional (on a revised estimate of $\theta$, the probability that $R = R_l$)
expected utility for a type 1 withdrawal exceeds their conditional expected
utility for a type 2 withdrawal, that is, if


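$$\frac{c_{11}^{1-\gamma}}{1-\gamma} + A'\frac{c_{21}^{1-\gamma}}{1-\gamma} > \frac{c_{12}^{1-\gamma}}{1-\gamma} + A'\frac{c_{22}^{1-\gamma}}{1-\gamma}, \tag{21}$$

where $A' = 1 - \hat{\theta} + \hat{\theta}(R_l/R_h)^{1-\gamma}$ and $\hat{\theta}$ denotes the
revised estimate of $\theta$. Because $1 - \gamma < 0$ and $c_{21} < c_{22}$, this simplifies to

$$A' < \frac{c_{12}^{1-\gamma} - c_{11}^{1-\gamma}}{c_{21}^{1-\gamma} - c_{22}^{1-\gamma}}. \tag{22}$$

By construction, the allocation achieved with the deposit contract is
incentive compatible given everyone's shared prior. That is, we know
that

$$\frac{c_{11}^{1-\gamma}}{1-\gamma} + A\frac{c_{21}^{1-\gamma}}{1-\gamma} \le \frac{c_{12}^{1-\gamma}}{1-\gamma} + A\frac{c_{22}^{1-\gamma}}{1-\gamma}, \tag{23}$$

where $A = 1 - \theta + \theta(R_l/R_h)^{1-\gamma}$. This simplifies to

$$A \ge \frac{c_{12}^{1-\gamma} - c_{11}^{1-\gamma}}{c_{21}^{1-\gamma} - c_{22}^{1-\gamma}}. \tag{24}$$

Given $\gamma > 1$ and $R_h > R_l$, bad news, that is, $\hat{\theta} > \theta$, implies that $A' > A$.
It follows immediately that, given bad news, (22) will never hold. That
is, bad news does not lead to a bank run.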
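
The step from bad news to a larger expectation factor can be checked numerically. The Python sketch below evaluates the factor 1 - θ + θ(R_l/R_h)^(1-γ) at a prior and at a revised probability of the bad state. The function name and the returns R_l = 1.0 and R_h = 2.0 are hypothetical values chosen only for illustration. With γ > 1 the factor rises with θ, which is the reverse-hedging case; with γ < 1 (for instance square root utility, γ = 0.5) it falls, which is the case in which bad news can trigger a run.

```python
def expectation_factor(theta, R_l, R_h, gamma):
    """Weight on second-period utility when the bad state has probability theta.

    In the bad state depositors receive R_l / R_h of the promised payment, so
    with CRRA utility the second-period term is scaled by (R_l / R_h)**(1 - gamma).
    """
    return 1.0 - theta + theta * (R_l / R_h) ** (1.0 - gamma)


# Hypothetical returns; theta values echo the prior and a bad signal.
R_l, R_h = 1.0, 2.0
prior, revised = 0.2, 0.6

for gamma in (0.5, 2.0):
    A = expectation_factor(prior, R_l, R_h, gamma)
    A_prime = expectation_factor(revised, R_l, R_h, gamma)
    print(gamma, A, A_prime, A_prime > A)
```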

That bad news about a bank’s assets does not lead to a bank run 
seems to run against one's intuition. So we believe a brief clarification 
is in order. We are not suggesting that, given a sound bank and a bank 
in financial distress, depositors will keep their money in the distressed 
bank. This will obviously not occur. This is not the point of our model 
or any other in this area. Rather, these models have been interpreted 
as models of the banking system. Thus what our finding suggests is 
that bad prospects for the banking system as a whole do not necessar¬ 
ily lead to a flow of funds out of the system in the absence of alterna¬ 
tive stores of value. In a multibank system in which there may be bad 
news about individual banks or in an economy with other intertem¬ 
poral markets, information-based runs will obviously exist regardless 
of the risk aversion of depositors. Depositors would withdraw funds 
from the bad bank and deposit them into an alternate bank (store of 
value) about which there has been no bad news. For example, the 
systemwide bank runs in the Great Depression might be explained,
given the rates of deflation and poor economic environment, by the
perception of currency and/or government bonds as a store of value
superior to loan-backed bank deposits. We conjecture that, if modeled
appropriately, our risk comparative static will hold generally in
settings with multiple banks or alternative stores of value. That is, 
when the informativeness of signals is held constant, deposit contracts 
will be relatively better for financing low-risk assets while equity con¬ 
tracts will be relatively better for financing high-risk assets. Of course, 
this would be true only in the absence of deposit insurance. A natural 
role for deposit insurance as insurance against idiosyncratic risk 
would arise in this environment. 16 

In summary, we believe that this paper has illustrated the differ¬ 
ences between information-based runs and pure panic runs. We do 
not claim to have all the answers as to how banking panics should be 
modeled, and we certainly are not ready to make sweeping policy 
proposals based on our analysis. In fact, the principal purpose of this 
paper is to point out how very complex and subtle the issues in the 
area are and to caution those who would form regulatory policy on 
existing analyses. We hope that the issue we have raised will lead to 
research that will provide us with more robust and empirically plausi¬ 
ble answers to the unanswered questions in this area. 


Appendix 

In this Appendix we show that the banker’s problem solved in Section IIIA is
equivalent to the problem solved by a social planner who wishes to maximize 
expected utility subject to incentive compatibility. The social planner wishes 
to 


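$$\max\; t\left[\sqrt{c_{11}} + \rho(1-\theta)\sqrt{c_{21h}} + \rho\theta\sqrt{c_{21l}}\right] + (1-t)\left[\sqrt{c_{12}} + (1-\theta)\sqrt{c_{22h}} + \theta\sqrt{c_{22l}}\right]$$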


subject to 


16 Note that there are two incentive effects associated with deposit insurance. The
first is that bank managers have an incentive to take on risk to maximize the option 
value of the deposit insurance. The second is that since depositors are protected, they 
have less incentive to monitor the managers. 
and

$$\sqrt{c_{11}} + \rho(1-\theta)\sqrt{c_{21h}} + \rho\theta\sqrt{c_{21l}} - \sqrt{c_{12}} - (1-\theta)\sqrt{c_{22h}} - \theta\sqrt{c_{22l}} \le 0,$$

where the h and l subscripts indicate consumption in the states in which
$R = R_h$ and $R = R_l$, respectively. The first-order conditions to this
optimization are


(‘ * T^t )^7 - «• + -• 

"*(' + ?) vfe * ft - ' 


pzj \ 

1 

1 - J 



8 1 


_P*s_\_J_ = JL 
1 ~ t) Vcrit Ri' 


1 - f - + T) + (1 - i‘« + X) ‘ °' 
1 - ‘h + T) + (1 - i c " + 


(Al) 

(A2) 

(A3) 

(A4) 

(A5) 

(A6) 

(A7) 

(A8) 


and 


$$\sqrt{c_{11}} + \rho(1-\theta)\sqrt{c_{21h}} + \rho\theta\sqrt{c_{21l}} - \sqrt{c_{12}} - (1-\theta)\sqrt{c_{22h}} - \theta\sqrt{c_{22l}} = 0, \tag{A9}$$


where $z_1, z_2, z_3 > 0$ are the Lagrange multipliers for the constraints.
The resource constraints (A7) and (A8) imply that 


mply 

that 

_ [1 - tf«s/(l ~ <)] ] 3 

1 1 + Mi) l 


cm 


Equations (A3) and (A5) imply that 

- [pz,/q - p}r _ 2 
1 + MO 

Equations (A4) and (A6) imply that 

- [pzs/(l - <)]) 2 .2 


P «21A- 


P Cut- 


(A10) 


(All) 


(A 12) 
Substituting (A11) and (A12) into (A10) yields


C 2U , {1 

- [pz,/(l - <)))* 


Rk V 

1 + (*»/<) J 

P A* 

= £*•£ + ! 

l 

'W 

t ■■ 1 

1 

2 , 

-2 C21/ 

*. 1 

1 + (*s/<) J 

p w 


which simplifies to $c_{21l} = (R_l/R_h)c_{21h}$. Similarly, $c_{22l} = (R_l/R_h)c_{22h}$.

Substituting for $c_{21l}$ and $c_{22l}$ in the social planner's problem yields the
banker’s problem stated in Section IIIA.
A Neoclassical Model of Unemployment and
the Business Cycle 


James D. Hamilton 

University of Virginia 


This paper investigates a general equilibrium model of unemploy¬ 
ment and the business cycle in which specialization of labor plays 
a key role. A rational expectations equilibrium with fully flexible 
wages and prices can exhibit unemployment in which the marginal 
product of employed workers exceeds the reservation wage of those 
who are without jobs. Workers are unemployed either because they 
are in the process of relocating for a better job or because they are 
waiting for conditions in the depressed sector to improve. Moreover, 
seemingly small disruptions in the supplies of primary commodities 
such as energy could be the source of fluctuations in aggregate em¬ 
ployment and can exert surprisingly large effects on real output. 


I. Introduction 

This paper argues that specialization of labor and capital accounts for
many of the features observed for unemployment and the business 
cycle. The theme is not new. Ricardo’s (1817) chapter 19, "On Sudden 
Changes in the Channels of Trade,” articulates many of the issues 
taken up in this paper. Feldstein (1975) and Jovanovic (1979, 1984) 
have emphasized that the nontrivial allocation problem in matching 
workers with jobs best suited for their characteristics is crucial to 
understanding unemployment, while Black (1982), Lilien (1982), and 
Davis (1984, 1985, 1987) suggested that the difficulties in allocating 
labor across different sectors may play a causal role in business cycles. 


I am much in debt to Steven Davis, Jonathan Eaton, Maxim Engers, Marjorie Flavin, 
William Johnson, Ron Michener, Knut Mork, and Steven Stern for constructive com¬ 
ments on earlier drafts of this paper. 
Davis (1984, 1985) and Loungani (1985, 1986) in particular have
developed the theme pursued here that energy price shocks may be a 
key cause of sectoral imbalance. 

The current paper claims three contributions. First, I show that the 
role of specialization in unemployment and the business cycle can be 
rigorously grounded in a fully specified general equilibrium model 
with rationally formed expectations. Second, I suggest that the model 
is able to capture some of the features of what we normally think of as 
“involuntary” unemployment. Third, I show that large fluctuations in
output could be generated by seemingly small disruptions in the
supply of primary commodities such as energy.

These points address what are widely perceived to be some of the 
most important gaps in real business cycle theory.1 The themes
pursued here could have been explored in a sticky-price model, in which
the effects I describe would be all the more dramatic. In modeling the
essential rigidities in the economy exclusively in terms of specification 
of technology rather than through assumptions about suboptimal 
pricing arrangements, I hope to have called attention to the impor¬ 
tance of the former for our understanding of unemployment and the 
business cycle. 

The principal propagation mechanism of the business cycle ex¬ 
plored in this paper is the possibility that an energy price increase will 
depress consumer purchases of energy-using goods such as auto¬ 
mobiles. 2 The dollar value of such purchases may be large relative 
to the value of the energy they use. If labor were able to relocate 
smoothly from one sector to another, most of the lost output would be 
made up by gains in other sectors. On the other hand, if there are 
costs or delays associated with labor mobility, then the losses of one 
sector need not be regained by another, and the short-term aggregate 
loss can exceed the dollar value of the lost energy by a substantial 
margin. Moreover, the period of unemployment is not necessarily 
limited by the amount of time necessary to relocate. If there is some 
probability of a return to better conditions, unemployed workers may 
rationally choose not to relocate, even if jobs offered in other sectors 
pay a wage that exceeds their marginal utility of leisure. 

Even though the model is market clearing, unemployed workers
are unhappy about losing their jobs and may envy currently
employed workers. The essential friction that accounts for this
phenomenon is the requirement that workers must forgo one period’s wages
if they wish to switch to a more prosperous sector. Subject to this
technological constraint, wages and prices are perfectly flexible.

1 Among the important contributions to this literature are Kydland and Prescott
(1982), Long and Plosser (1983), Barro and King (1984), King and Plosser (1984),
Kydland (1984, 1985), Hansen (1985), and McCallum (1986).

2 Davis (1984) has emphasized that an energy price decrease can also increase
unemployment in this model.

The paper is organized as follows. Section II sets out the basic
model, with equilibrium formally characterized in Section III. Section
IV discusses some of the advantages of this view of unemployment
over alternative theories, while Section V analyzes the size of the
disturbances necessary to result in such effects.


II. The Model 

An individual worker may seek employment in either of two sectors. I 
assume that preferences and the production technology are such that 
the worker has no desire or opportunity to work less than the stan¬ 
dard workweek at a given job. Thus a particular worker is either 
employed full-time in sector 1, employed full-time in sector 2, or 
unemployed. 

In addition to the outputs from the two sectors, there is an unpro¬ 
duced good in the economy with which households are exogenously 
endowed. The economywide supply at date t of this good is denoted
$X_t$. Households may buy or sell X to one another or to firms, and its
price is determined endogenously.

The general equilibrium model that I present below has the feature 
that a decrease in $X_t$ reduces the utility of workers in both sectors,
though those employed in sector 1 are harder hit than those in sector
2. Changes in labor’s marginal product in sector 1 could arise through
either of two channels. The first might be described as a supply effect,
in which $X_t$ is an input used along with labor in the production of
good 1. When firms cut back on the use of this input, labor’s marginal
physical product may be reduced directly.4 The second channel might
be described as a demand effect. In this case X appears as an argu¬ 
ment of the utility function of consumers rather than an argument of 
the production function of firms. An increase in the cost of X might 
make consumers want to cut back on their purchases of good 1; for 
example, an oil price shock may reduce the demand for new cars. 5 
The resulting decrease in the relative price of good 1 depresses 
labor's marginal product in sector 1 relative to sector 2. In an earlier 
version of this paper, I provided explicit parametric examples of both 
supply and demand effects and showed that in general equilibrium 

3 See Hansen (1985) for a detailed discussion of this very important assumption. 

4 For sticky-price versions of this argument, see Phelps (1978) and Mork and Hall
(1980). 

5 See Bernanke (1983) for an interesting perspective on how this effect might occur. 
the two are algebraically equivalent. In the current version, only the
demand effects are modeled explicitly. 

Suppose we were told that the endowment of X was $X_t$, that a
specified number $L_{1,t}$ of workers turn out to be employed in sector 1
(technically, $L_{1,t}$ denotes the Lebesgue measure since I model workers
as a continuum rather than a discrete set), and $L_{2,t}$ workers turn out to
be employed in sector 2. We can calculate what the levels of output,
real wages, and relative prices must be to persuade firms to hire
workers in those numbers and to persuade consumers to purchase all
that would be produced. One can then ask, for that specification of
the vector of relative prices, what the current-period level of utility
would be for the kth individual worker under the following three
possibilities: (1) the worker is employed in sector 1, (2) the worker is
employed in sector 2, or (3) the worker is unemployed. Denote
these levels of utility by $v_1(X_t, L_{1,t}, L_{2,t}, k)$, $v_2(X_t, L_{1,t}, L_{2,t}, k)$, and
$v_0(X_t, L_{1,t}, L_{2,t}, k)$. In Section III I will go on to analyze the labor supply
decision the worker would want to make between options 1, 2, and 3.
Hypothesizing different values for $L_{1,t}$ or $L_{2,t}$ will change the vector of
relative prices and alter the choice the worker would want to make
between these options. General equilibrium will then be characterized
by those values of $L_{1,t}$ and $L_{2,t}$ for which the assumed number of
workers ($L_{1,t}$ and $L_{2,t}$) would indeed turn out to seek employment in
the respective sectors.

In the remainder of Section II, I show that a parametric
specification of preferences and production technology exists for which
the functions $v_1(\cdot)$, $v_2(\cdot)$, and $v_0(\cdot)$ admit the following representations:

$$v_1(X_t, L_{1,t}, L_{2,t}, k) = a(X_t, L_{1,t}, L_{2,t}, k) + u_1(X_t, L_{1,t}, L_{2,t}), \tag{1}$$

$$v_2(X_t, L_{1,t}, L_{2,t}, k) = a(X_t, L_{1,t}, L_{2,t}, k) + u_2(X_t, L_{1,t}, L_{2,t}), \tag{2}$$

$$v_0(X_t, L_{1,t}, L_{2,t}, k) = a(X_t, L_{1,t}, L_{2,t}, k) + \omega. \tag{3}$$

Note that the term $a(\cdot)$ is common to all three expressions. This is the
level of utility the consumer receives independently of his personal
employment decision. Equations (1)-(3) claim that the single-period
utility $v_i(\cdot)$ (i = 1, 2, 0) is additively separable between this common
component $a(\cdot)$ and a term $u_i(\cdot)$ that is a monotonic function of $X_t$, $L_{1,t}$,
and $L_{2,t}$. The relative gain in utility from working in sector 1, $u_1(\cdot)$, is
monotonically increasing in $X_t$: bigger values of $X_t$ improve the
demand for sector 1 output. The function $u_1(\cdot)$ is monotonically
decreasing in $L_{1,t}$: the greater the number of other people who are already
working in sector 1, the lower the real wage that worker k could
receive in that sector, and so sector 1 employment becomes less
attractive relative to sector 2. The function $u_1(\cdot)$ is monotonically increasing
in $L_{2,t}$: greater employment in sector 2 increases the economy's
production of good 2, and this improves the terms of trade and effective
real wage for sector 1. For similar reasons, the added utility from
working in sector 2, $u_2(\cdot)$, is monotonically increasing in $L_{1,t}$ and
decreasing in $L_{2,t}$. The function $u_2(\cdot)$ is also monotonically increasing in
$X_t$ as a consequence of general equilibrium wealth effects, though
added $X_t$ increases $u_1(\cdot)$ by more than it increases $u_2(\cdot)$. The term $\omega$ in
expression (3) corresponds to the marginal utility of leisure.

The reader who is uninterested in confirming the derivation of 
equations (l)-(3) may choose to skip much of the remainder of this 
section on a first reading. 


A. Production 

Production is carried out by two representative price-taking firms. 
Period t output of sector 1 is governed by the production function 

$$Y_{1,t} = F(L_{1,t}), \tag{4}$$

and similarly for sector 2:

$$Y_{2,t} = G(L_{2,t}). \tag{5}$$

Output of sector 1 is taken as the numeraire; $P_{2,t}$ thus denotes the
price of good 2 in terms of good 1. It turns out that in equilibrium the
two sectors will in general pay different wages, $W_{1,t}$ and $W_{2,t}$, again
expressed in units of good 1. Both firms are assumed to act as
competitive profit maximizers and so set the marginal product of labor equal
to the product wage:

$$F'(L_{1,t}) = W_{1,t}, \tag{6}$$

$$G'(L_{2,t}) = \frac{W_{2,t}}{P_{2,t}}. \tag{7}$$


B. Households 

During period t, household k consumes amounts $c_{1,t}(k)$ of good 1,
$c_{2,t}(k)$ of good 2, $x_t(k)$ of the unproduced good, and $h_t(k)$ units of
leisure. The worker is either employed, $h_t(k) = 0$, or unemployed,
$h_t(k) = 1$, and seeks to maximize

$$E_t \sum_{\tau=t}^{\infty} \lambda^{\tau-t}\, U\bigl(c_{1,\tau}(k), c_{2,\tau}(k), x_\tau(k), h_\tau(k)\bigr),$$

where $E_t$ denotes the expectation formed at date t. My parametric
example employs a nested Cobb-Douglas constant elasticity of
substitution utility function:

$$U\bigl(c_{1,\tau}(k), c_{2,\tau}(k), x_\tau(k), h_\tau(k)\bigr) = \bigl\{[x_\tau(k)]^{\rho} + \Delta[c_{1,\tau}(k)]^{\rho}\bigr\}^{\theta/\rho}\,[c_{2,\tau}(k)]^{1-\theta} + \omega \cdot h_\tau(k). \tag{8}$$

In order to model complementarity between x and $c_1$, I specify that
$\rho < 0$, implying that there is less than unit direct elasticity of
substitution between x and $c_1$ (but unit elasticity between this composite
x-$c_1$ and $c_2$).

The household’s income during period t comes from three sources.
Its labor income is $[1 - h_t(k)]W_{i,t}$, where $W_{i,t}$ is the wage paid if the
household works in sector i at date t. The household also owns a share
$\phi(k)$ in the total profits $\pi_t$ of the two firms and a share $z(k)$ in the
economywide endowment of $X_t$.6 Thus its single-period budget
constraint is7

$$c_{1,\tau}(k) + P_{2,\tau}c_{2,\tau}(k) + P_{X,\tau}x_\tau(k) = [1 - h_\tau(k)]W_{i,\tau} + \phi(k)\pi_\tau + z(k)P_{X,\tau}X_\tau, \qquad \tau = t, t + 1, \ldots \tag{9}$$


There is no store of value in this economy, and thus conditional on
having chosen unemployment or employment in a particular sector i
at some date t, the optimal choice of $c_{1,t}(k)$, $c_{2,t}(k)$, and $x_t(k)$ is the
solution to maximizing (8) subject to (9).8 The first-order conditions
for this problem call for the consumer to spend a constant fraction
$(1 - \theta)$ of his income on good 2 and fractions $\theta\gamma_t$ and $\theta(1 - \gamma_t)$ on goods
X and 1, where $\gamma_t$ is a function of the relative price of good X:

$$P_{X,t}x_t(k) = \theta \cdot \gamma_t \cdot \{[1 - h_t(k)]W_{i,t} + \phi(k)\pi_t + z(k)P_{X,t}X_t\}, \tag{10}$$

$$c_{1,t}(k) = \theta \cdot (1 - \gamma_t) \cdot \{[1 - h_t(k)]W_{i,t} + \phi(k)\pi_t + z(k)P_{X,t}X_t\}, \tag{11}$$

$$P_{2,t}c_{2,t}(k) = (1 - \theta) \cdot \{[1 - h_t(k)]W_{i,t} + \phi(k)\pi_t + z(k)P_{X,t}X_t\}, \tag{12}$$

$$\gamma_t = \frac{[x_t(k)]^{\rho}}{[x_t(k)]^{\rho} + \Delta[c_{1,t}(k)]^{\rho}}. \tag{13}$$


6 Thus for A the set of all households, $\int_A \phi(k)\,dk = \int_A z(k)\,dk = 1$.

7 Goods $Y_1$, $Y_2$, and X are all taken to be nonstorable. A more complete general
equilibrium model would make $\phi(k)$ and $z(k)$ endogenous by allowing trade in the
claims on these implicit assets. Including an asset market would seem to offer little
additional insight. For my parametric example, the time separability of the utility
function and the linearity of utility in wealth (eq. [15] below) would seem to eliminate
any role for an insurance or contingent claims market. The effects derived below are
due to the specification of technology and have nothing to do with the
consumption-multiplier effects of Keynesian models or with motives for intertemporal
smoothing of consumption and leisure.

8 Later I will analyze the optimal labor decisions by using these semi-reduced-form
expressions. The technique is analogous to concentrating a likelihood function (see
Koopmans and Hood 1953, pp. 156-57).
A little work with (10), (11), and (13) shows that

$$\gamma_t = \frac{1}{1 + b^{1/(1-\rho)}\,P_{X,t}^{\rho/(1-\rho)}}, \tag{14}$$

and hence the notation: $\gamma_t$ is the same for all workers $k$.
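
The algebra behind (14) can be spelled out in a few lines. The following is a sketch of the omitted steps, using only the symbols of (10), (11), and (13) and assuming an interior solution with $x_t(k) > 0$ and $c_{1,t}(k) > 0$:

```latex
\begin{align*}
\frac{P_{X,t}\,x_t(k)}{c_{1,t}(k)} &= \frac{\gamma_t}{1-\gamma_t}
  && \text{dividing (10) by (11)} \\
\frac{\gamma_t}{1-\gamma_t} &= \frac{[x_t(k)]^{\rho}}{b\,[c_{1,t}(k)]^{\rho}}
  && \text{rearranging (13)} \\
\left[\frac{c_{1,t}(k)}{x_t(k)}\right]^{1-\rho} &= b\,P_{X,t}
  && \text{equating the two right-hand sides} \\
\gamma_t = \frac{1}{1 + b\,[c_{1,t}(k)/x_t(k)]^{\rho}}
  &= \frac{1}{1 + b^{1/(1-\rho)}\,P_{X,t}^{\rho/(1-\rho)}}
  && \text{substituting back into (13)}
\end{align*}
```

Because the last expression involves only $b$, $\rho$, and the price $P_{X,t}$, the share is common to all workers, which is why it carries no index $k$.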

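Substituting (13), (10), and (12) into (8) yields the single-period indirect utility function

$$\frac{\gamma_t^{\theta-(\theta/\rho)} \cdot \theta^{\theta} \cdot (1 - \theta)^{1-\theta} \cdot \{[1 - h_t(k)]W_{i,t} + \phi(k)\pi_t + z(k)P_{X,t}X_t\}}{P_{X,t}^{\theta} \cdot P_{2,t}^{1-\theta}} + \bar{u} \cdot h_t(k). \tag{15}$$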


C. Goods Market Equilibrium 

Let $A$ denote the set of all workers. Equilibrium in the markets for $Y_1$, $Y_2$, and $X$ requires

$$\int_A c_{1,t}(k)\,dk = Y_{1,t},$$

$$\int_A c_{2,t}(k)\,dk = Y_{2,t},$$

$$\int_A x_t(k)\,dk = X_t.$$

Any two of these equations determine $P_{2,t}$ and $P_{X,t}$; the third equation can then be derived from the budget constraints of households via a familiar application of Walras's law. Integrating (11) and using the first market-clearing condition along with the definition of aggregate profits ($\pi_t = Y_{1,t} - W_{1,t}L_{1,t} + P_{2,t}Y_{2,t} - W_{2,t}L_{2,t}$) yields

$$Y_{1,t} = \theta(1 - \gamma_t) \cdot (Y_{1,t} + P_{2,t}Y_{2,t} + P_{X,t}X_t). \tag{16}$$

The third market-clearing condition together with integration of (10) likewise yields

$$P_{X,t}X_t = \theta \cdot \gamma_t \cdot (Y_{1,t} + P_{2,t}Y_{2,t} + P_{X,t}X_t). \tag{17}$$

Solving (16) and (17) gives the equilibrium prices

$$P_{2,t} = \frac{(1 - \theta)\,Y_{1,t}}{\theta(1 - \gamma_t)\,Y_{2,t}}, \tag{18}$$

$$P_{X,t} = \frac{\gamma_t\,Y_{1,t}}{(1 - \gamma_t)\,X_t}. \tag{19}$$
Using (19), we can then solve (14):

$$\gamma_t = \frac{1}{1 + b\,[Y_{1,t}/X_t]^{\rho}}. \tag{20}$$


Equations (4)-(7) and (18)-(20) determine the equilibrium quantities $Y_{1,t}$, $Y_{2,t}$, $W_{1,t}$, $W_{2,t}$, $P_{2,t}$, $P_{X,t}$, and $\gamma_t$ as functions of $X_t$, $L_{1,t}$, and $L_{2,t}$. The utility of worker $k$ (expression [15]) can then likewise be solved as a function of $X_t$, $L_{1,t}$, and $L_{2,t}$. In particular, define $\alpha$ to be the component of (15) that is independent of the worker's employment decision:

$$\alpha = \frac{\gamma_t^{\theta-(\theta/\rho)} \cdot \theta^{\theta} \cdot (1 - \theta)^{1-\theta} \cdot [\phi(k)\pi_t + z(k)P_{X,t}X_t]}{P_{X,t}^{\theta} \cdot P_{2,t}^{1-\theta}}.$$

Solving out for $P_{X,t}$, $P_{2,t}$, $\gamma_t$, and $\pi_t$ as above allows us to write $\alpha(X_t, L_{1,t}, L_{2,t}; k)$. The single-period utility level $v_1$ of a worker who ends up finding employment in sector 1 at wage $W_{1,t} = F'(L_{1,t})$ is then given by equation (1), in which

$$u_1(X_t, L_{1,t}, L_{2,t}) = [\gamma(X_t, L_{1,t})]^{-\theta/\rho} \cdot \theta\,[1 - \gamma(X_t, L_{1,t})] \cdot X_t^{\theta} \times [G(L_{2,t})]^{1-\theta} \cdot \frac{F'(L_{1,t})}{F(L_{1,t})} \tag{21}$$

and

$$\gamma(X_t, L_{1,t}) = \frac{1}{1 + b\,[F(L_{1,t})/X_t]^{\rho}}.$$

Likewise, the utility $v_2$ if employed in sector 2 at wage $W_{2,t} = G'(L_{2,t}) \cdot P_{2,t}$ is given by equation (2), where

$$u_2(X_t, L_{1,t}, L_{2,t}) = [\gamma(X_t, L_{1,t})]^{-\theta/\rho} \cdot (1 - \theta) \cdot X_t^{\theta} \times [G(L_{2,t})]^{-\theta} \cdot G'(L_{2,t}). \tag{22}$$

The utility if unemployed is of course given by equation (3).

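For Cobb-Douglas production functions ($F(L_1) = L_1^{\eta}$ and $G(L_2) = L_2^{\beta}$), total differentiation of (20), (21), and (22) yields

$$\frac{du_1}{u_1} = (\theta - \rho)\gamma \cdot \frac{dX}{X} + [(1 - \theta)\beta] \cdot \frac{dL_2}{L_2} + \{-1 + [\theta(1 - \gamma) + \rho\gamma]\eta\} \cdot \frac{dL_1}{L_1}, \tag{23}$$

$$\frac{du_2}{u_2} = (\theta\gamma) \cdot \frac{dX}{X} + [-1 + (1 - \theta)\beta] \cdot \frac{dL_2}{L_2} + [\theta\eta(1 - \gamma)] \cdot \frac{dL_1}{L_1}. \tag{24}$$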
Note that for $\rho < 0$, $u_1$ is decreasing in $L_1$ and increasing in $L_2$, whereas $u_2$ is increasing in $L_1$ and decreasing in $L_2$, as claimed at the beginning of this section. Likewise $u_1$ and $u_2$ are both increasing in $X$, though more $X$ tends to make sector 1 employment more attractive relative to sector 2. This completes the derivation of equations (1)-(3).
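
The elasticities in (23) and (24) can be checked numerically against the closed forms (21) and (22). The sketch below is illustrative only: the function and variable names are hypothetical, the parameter values are the ones the paper quotes for Fig. 1, and the production exponents are labeled `ETA` and `BETA` as an assumption about the notation.

```python
import math

# Parameter values quoted for Fig. 1; names are illustrative assumptions.
THETA, RHO, B = 0.3, -1.5, 2.0
ETA, BETA = 0.7, 0.7  # exponents of F(L1) = L1**ETA and G(L2) = L2**BETA


def gamma(x, l1):
    """Share gamma(X, L1) = 1 / (1 + b [F(L1)/X]^rho)."""
    return 1.0 / (1.0 + B * (l1 ** ETA / x) ** RHO)


def u1(x, l1, l2):
    """Equation (21), using F'(L1)/F(L1) = ETA / L1."""
    g = gamma(x, l1)
    return (g ** (-THETA / RHO) * THETA * (1 - g) * x ** THETA
            * (l2 ** BETA) ** (1 - THETA) * ETA / l1)


def u2(x, l1, l2):
    """Equation (22), using G'(L2) = BETA * L2**(BETA - 1)."""
    g = gamma(x, l1)
    return (g ** (-THETA / RHO) * (1 - THETA) * x ** THETA
            * (l2 ** BETA) ** (-THETA) * BETA * l2 ** (BETA - 1))


def elasticities(f, x, l1, l2, h=1e-6):
    """Central-difference elasticities of f with respect to (X, L1, L2)."""
    point = [x, l1, l2]
    out = []
    for i in range(3):
        up, down = list(point), list(point)
        up[i] *= math.exp(h)
        down[i] *= math.exp(-h)
        out.append((math.log(f(*up)) - math.log(f(*down))) / (2 * h))
    return out


def analytic(x, l1):
    """Coefficients of (23) and (24), ordered as (dX/X, dL1/L1, dL2/L2)."""
    g = gamma(x, l1)
    e1 = [(THETA - RHO) * g,
          -1 + (THETA * (1 - g) + RHO * g) * ETA,
          (1 - THETA) * BETA]
    e2 = [THETA * g,
          THETA * ETA * (1 - g),
          -1 + (1 - THETA) * BETA]
    return e1, e2
```

Comparing `elasticities(u1, 1.0, 1.0, 4.0)` with the first list returned by `analytic(1.0, 1.0)` reproduces the sign pattern described in the text for $\rho < 0$.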

III. General Equilibrium 

Suppose that the economywide endowment $X_t$ takes on one of two values, $M$ or $m$ (with $M > m$), according to a Markov process:

$$\text{prob}[X_t = M \mid X_{t-1} = M] = p,$$

$$\text{prob}[X_t = m \mid X_{t-1} = M] = 1 - p,$$

$$\text{prob}[X_t = M \mid X_{t-1} = m] = 1 - q,$$

$$\text{prob}[X_t = m \mid X_{t-1} = m] = q,$$

where $\text{prob}[A \mid B]$ denotes the probability of event $A$ conditional on event $B$. An individual worker ($k$) is assumed to know these probabilities and the reduced-form functions (1)-(3). His employment decision is represented by the choice of a sequence $\{\sigma_{\tau,k}\}$ with $\sigma_{\tau,k} = 0$, 1, or 2 so as to maximize

$$E_t \sum_{\tau=t}^{\infty} \lambda^{\tau-t} \cdot v_{\sigma_{\tau,k}}(X_\tau, L_{1,\tau}, L_{2,\tau}; k) = E_t \sum_{\tau=t}^{\infty} \lambda^{\tau-t} \cdot \alpha(X_\tau, L_{1,\tau}, L_{2,\tau}; k) + E_t \sum_{\tau=t}^{\infty} \lambda^{\tau-t} \cdot u_{\sigma_{\tau,k}}(X_\tau, L_{1,\tau}, L_{2,\tau}).$$

The first term on the right-hand side of this expression represents the 
utility the worker will receive no matter what employment decision he 
makes, and it is irrelevant for the analysis of this section. 

My specification of the nature of frictions in the labor market and 
the basic equilibrium methodology is adapted from Lucas and Pres¬ 
cott (1974). I impose a feasibility constraint similar to theirs, which I 
interpret as a restriction imposed by the technology of location and 
training.⁹ A worker must precommit at date t to the sector in which he
will seek employment at date t + 1. Once date t + 1 arrives, the 

9 An alternative interpretation, which would bring this account closer to that in 
Kydland and Prescott (1982), is that relocation of workers requires construction of new 
idiosyncratic physical capital that takes one period to build. The location interpretation 
is perhaps the most natural given my particular specification of the technology. Accord¬ 
ing to my model, idiosyncratic human capital never depreciates as long as one remains 
committed to the same sector, and it depreciates 100 percent as soon as one switches 
sectors. 
worker must choose between employment in that precommitted sector or unemployment. Most important, I specify that the worker can switch from one sector to another only by first going through a period of unemployment.

My approach is to conjecture a particular structure for the time series $\{L_{1,t}\}$ and $\{L_{2,t}\}$ in general equilibrium. I then show that, if relative prices in the aggregate economy were those associated with these sequences, then an individual worker who lived in such an economy would indeed optimally choose a time path such that, when these individual paths are aggregated across all workers $k$, the labor supplied to each sector at each date would equal the hypothesized magnitudes $L_{1,t}$ and $L_{2,t}$.

The conjectured equilibrium is one in which an individual worker would consider switching sectors only when the aggregate endowment of $X_t$ changes. If $X_t = X_{t-1}$, any worker will make the same sectoral commitment at date $t$ as that individual made at $t - 1$. The macroeconomic equilibrium thus consists of four separate states: state 1: $X_t = M$ and $X_{t-1} = M$; state 2: $X_t = m$ and $X_{t-1} = M$; state 3: $X_t = M$ and $X_{t-1} = m$; state 4: $X_t = m$ and $X_{t-1} = m$. Each of these states ($j = 1, 2, 3, 4$) implies a particular exogenous value for $X(j)$ and equilibrium values for $L_1(j)$ and $L_2(j)$. For example, if the economy is in state 3 at date $t$, then $X_t = X(3) = M$.

The magnitudes $L_1(j)$ and $L_2(j)$ for $j = 1, 2, 3, 4$ are in principle all endogenous. I have found it expositionally simplest just to posit a second class of workers who are assumed to have zero marginal utility of leisure and who lack an innate attribute necessary to work in sector 1. These individuals are thus always employed in sector 2. Let $\bar{L}_2$ denote the total number of these immobile workers and $\bar{L}_1$ the total number of mobile workers on which the entire analysis focuses. I then will choose exogenous parameters in such a way that everyone who could would want to be employed in sector 1 in state 1. It will thus turn out that $L_1(1) = \bar{L}_1$ and $L_2(1) = \bar{L}_2$; $L_1(j)$ and $L_2(j)$ for $j = 2, 3, 4$ can all be conveniently parameterized relative to $\bar{L}_1$ and $\bar{L}_2$. Assuming the existence of this second class of immobile workers also shortens the discussion of the kinds of switching strategies that have to be considered in general equilibrium. The primary role of the immobile workers is thus to simplify the notation and exposition; they have essentially nothing to do with the substantive claims made below. Throughout the paper, whenever I refer to the motives of "workers" I refer exclusively to the motives of mobile workers.

Note that, under the conjectured equilibrium, a sufficient statistic for describing the current and forecasting all future macroeconomic conditions is whether the current state ($s_t$) takes on the value 1, 2, 3, or 4. If in addition we know the sector ($\xi_{t,k}$) to which the $k$th worker is committed for period $t$ (as a consequence of a decision he made at $t - 1$), then we can calculate the maximum expected lifetime utility for that individual:

$$J(s_t, \xi_{t,k}) = \max_{\{\sigma_{\tau,k},\,\xi_{\tau+1,k}\}_{\tau=t}^{\infty}} E_t \sum_{\tau=t}^{\infty} \lambda^{\tau-t} \cdot u_{\sigma_{\tau,k}}(X_\tau, L_{1,\tau}, L_{2,\tau}) \tag{25}$$

subject to $\sigma_{\tau,k} \in \{0, \xi_{\tau,k}\}$.

Appendix A establishes that we can limit our analysis to state-dependent strategies. That is, a particular worker is following a strategy such as $\xi(k) = (1, 0, 0, 2)$, which notation conveys that worker $k$ is employed in sector 1 whenever the macroeconomy is in state 1, employed in sector 2 whenever the macroeconomy is in state 4, and switching between the two in states 2 and 3.

We can then arrive at closed-form expressions for equation (25). Let $p_{ij}(n)$ denote the probability that $n$ periods from today the economy will be in state $j$, given that today's state is $i$. Let $P(n)$ be the $4 \times 4$ matrix whose $(i, j)$th element is $p_{ij}(n)$. Thus $P(0)$ is the identity matrix and

$$P(1) = \begin{bmatrix} p & (1 - p) & 0 & 0 \\ 0 & 0 & (1 - q) & q \\ p & (1 - p) & 0 & 0 \\ 0 & 0 & (1 - q) & q \end{bmatrix}. \tag{26}$$


Suppose that the economy is currently in state $s_t = i$. Then the expected returns to an arbitrary state-dependent strategy $\xi$ can be written as

$$V(\xi, s_t)\big|_{s_t = i} = \sum_{j=1}^{4} \sum_{n=0}^{\infty} p_{ij}(n) \cdot \lambda^{n} \cdot u_{\xi_j}(X(j), L_1(j), L_2(j)), \tag{27}$$

where $\xi_j$ denotes the $j$th element of the vector $\xi$. Thus in steady state,¹⁰

$$J(s_t, \xi_{t,k}) = \max_{\xi} V(\xi, s_t)$$

subject to the current-period choice belonging to $\{0, \xi_{t,k}\}$. Evaluation of (27) requires an expression for $\sum_{n=0}^{\infty} p_{ij}(n) \cdot \lambda^{n}$. Recall (e.g., Chiang 1980, p. 109) that, for a Markov process, $P(n) = [P(1)]^{n}$, and so since $0 \le \lambda < 1$,

$$I + \lambda \cdot P(1) + \lambda^{2} \cdot P(2) + \lambda^{3} \cdot P(3) + \ldots = [I - \lambda \cdot P(1)]^{-1}.$$

10 That is, we consider a worker for whom $\xi_{t,k}$ represents a decision previously made as part of an optimal state-dependent strategy.
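From (26) one calculates

$$[I - \lambda \cdot P(1)]^{-1} = \Delta^{-1} \cdot \begin{bmatrix} [\Delta + \lambda p(1 - \lambda q)] & [\lambda(1 - p)(1 - \lambda q)] & [\lambda^{2}(1 - p)(1 - q)] & [\lambda^{2}q(1 - p)] \\ [\lambda^{2}p(1 - q)] & [(1 - \lambda p)(1 - \lambda q)] & [\lambda(1 - q)(1 - \lambda p)] & [\lambda q(1 - \lambda p)] \\ [\lambda p(1 - \lambda q)] & [\lambda(1 - p)(1 - \lambda q)] & [(1 - \lambda p)(1 - \lambda q)] & [\lambda^{2}q(1 - p)] \\ [\lambda^{2}p(1 - q)] & [\lambda^{2}(1 - p)(1 - q)] & [\lambda(1 - q)(1 - \lambda p)] & [\Delta + \lambda q(1 - \lambda p)] \end{bmatrix}, \tag{28}$$

where $\Delta \equiv (1 - \lambda)(1 + \lambda - \lambda p - \lambda q)$. Thus $\sum_{n=0}^{\infty} p_{ij}(n) \cdot \lambda^{n}$ is the $(i, j)$th element of (28).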
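
The closed form in (28) is easy to confirm numerically by multiplying it by $I - \lambda \cdot P(1)$ and checking that the identity matrix results. The following is a minimal sketch in plain Python; the function names are hypothetical and chosen only for illustration.

```python
def transition_matrix(p, q):
    """P(1) from (26); rows and columns are states 1 through 4."""
    return [[p, 1 - p, 0.0, 0.0],
            [0.0, 0.0, 1 - q, q],
            [p, 1 - p, 0.0, 0.0],
            [0.0, 0.0, 1 - q, q]]


def closed_form(p, q, lam):
    """Right-hand side of (28): the bracketed matrix divided by Delta."""
    delta = (1 - lam) * (1 + lam - lam * p - lam * q)
    a, b = 1 - lam * p, 1 - lam * q
    m = [[delta + lam * p * b, lam * (1 - p) * b,
          lam ** 2 * (1 - p) * (1 - q), lam ** 2 * q * (1 - p)],
         [lam ** 2 * p * (1 - q), a * b,
          lam * (1 - q) * a, lam * q * a],
         [lam * p * b, lam * (1 - p) * b,
          a * b, lam ** 2 * q * (1 - p)],
         [lam ** 2 * p * (1 - q), lam ** 2 * (1 - p) * (1 - q),
          lam * (1 - q) * a, delta + lam * q * a]]
    return [[v / delta for v in row] for row in m]


def matmul(x, y):
    """Product of two square matrices stored as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def residual(p, q, lam):
    """Largest absolute entry of (I - lam P(1)) @ closed_form - I."""
    big_p = transition_matrix(p, q)
    a = [[(1.0 if i == j else 0.0) - lam * big_p[i][j] for j in range(4)]
         for i in range(4)]
    prod = matmul(a, closed_form(p, q, lam))
    return max(abs(prod[i][j] - (1.0 if i == j else 0.0))
               for i in range(4) for j in range(4))
```

Calling `residual(0.8, 0.8, 0.95)` with the paper's parameter values should return a number at the level of floating-point rounding error.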

I now seek to characterize those state-dependent strategies $\xi$ that will be adopted in equilibrium. Suppose that the economy is currently in state $s_t = 3$, and consider a mobile worker who is committed to sector 2 ($\xi_{t,k} = 2$). (Such a worker may be hypothetical; our ultimate purpose is to see whether someone would be willing to follow a strategy that put him in sector 2 in state 3.) The strategies $\xi' = (2, 2, 2, 2)$ and $\xi'' = (1, 0, 0, 2)$ are both available to that worker. If

$$V(\xi', 3) < V(\xi'', 3), \tag{29}$$

then no worker would choose $\xi'$; everyone who can will switch from sector 2 to sector 1 each time the economy is in state 3. We can calculate a closed-form expression for (29) by using equations (27) and (28):

$$\begin{aligned} V(\xi'', 3) - V(\xi', 3) = \Delta^{-1} \cdot \{ & [\lambda p(1 - \lambda q)][u_1(X(1), L_1(1), L_2(1)) - u_2(X(1), L_1(1), L_2(1))] \\ & + [\lambda(1 - p)(1 - \lambda q)][\bar{u} - u_2(X(2), L_1(2), L_2(2))] \\ & + [(1 - \lambda p)(1 - \lambda q)][\bar{u} - u_2(X(3), L_1(3), L_2(3))] \}. \end{aligned} \tag{30}$$

Recall that $u_1(\cdot)$ is decreasing in $L_1$ and increasing in $L_2$ whereas $u_2(\cdot)$ is increasing in $L_1$ and decreasing in $L_2$. Consider, therefore, the consequences of the following inequality:

$$\lambda p[u_1(M, \bar{L}_1, \bar{L}_2) - u_2(M, \bar{L}_1, \bar{L}_2)] + \lambda(1 - p)[\bar{u} - u_2(m, \bar{L}_1, \bar{L}_2)] + (1 - \lambda p)[\bar{u} - u_2(M, \bar{L}_1, \bar{L}_2)] > 0. \tag{31}$$

If (31) holds, then (30) must be positive for all $L_1(j) \le \bar{L}_1$ and $L_2(j) \ge \bar{L}_2$. Thus condition (31) is a characterization of the exogenous parameters that is sufficient to ensure that no mobile worker would ever choose to be continuously employed in sector 2 in equilibrium. One can further calculate that the strategy (1, 0, 0, 0) will dominate (0, 0, 0, 0) for all values of $L_1(1) \le \bar{L}_1$ and $L_2(1) \ge \bar{L}_2$ provided that

$$u_1(M, \bar{L}_1, \bar{L}_2) - \bar{u} > 0. \tag{32}$$
Finally, I assume a condition implying that in state 1, sector 1 employment is strictly preferred to sector 2 for all values of $L_1(1) \le \bar{L}_1$ and $L_2(1) \ge \bar{L}_2$:

$$u_1(M, \bar{L}_1, \bar{L}_2) - u_2(M, \bar{L}_1, \bar{L}_2) > 0. \tag{33}$$

Throughout the following analysis I assume that parameters are such that (31)-(33) hold along with $M > m$. I show in Appendix B that these assumptions ensure that all mobile workers are employed in sector 1 in state 1: $L_1(1) = \bar{L}_1$ and $L_2(1) = \bar{L}_2$.

The feasible equilibria can then be separated into four cases. In 
case A, unemployment arises because workers in the depressed sector 
are waiting for conditions to improve. In case B, unemployment 
arises because workers choose to switch sectors. Case C involves per¬ 
manent full employment, while case D can exhibit a mixture of both 
waiting-time and sector-switching unemployment. 


Case A 

Suppose that, in addition to (31)-(33), the following inequalities hold:

$$u_1(m, \bar{L}_1, \bar{L}_2) - \bar{u} < 0, \tag{34}$$

$$\lambda q[u_2(m, \bar{L}_1, \bar{L}_2) - u_1(m, \bar{L}_1, \bar{L}_2)] + \lambda(1 - q)[\bar{u} - u_1(M, \bar{L}_1, \bar{L}_2)] + (1 - \lambda q)[\bar{u} - u_1(m, \bar{L}_1, \bar{L}_2)] < 0. \tag{35}$$

Condition (35) is the mirror image of (31) and implies that when the economy is currently in state 2, the strategy (1, 1, 1, 1) dominates (1, 0, 0, 2) for all $L_1(j) \le \bar{L}_1$ and all $L_2(j) \ge \bar{L}_2$. Thus the strategy (1, 0, 0, 2) would never be adopted by a rational worker in this equilibrium. Strategy (2, 0, 0, 1) is of course dominated by (1, 0, 0, 1) by (33). With switching between sectors thus ruled out, in the case A equilibrium the magnitudes for state 3 will be the same as in state 1, just as those for state 2 will be the same as in 4.

Nor can the equilibrium call for full employment in all states. If $L_1(2)$ and $L_1(4)$ were equal to $\bar{L}_1$, then (34) implies that every mobile worker would want to quit sector 1 in states 2 and 4. Employment must fall in these states until it equals the value $L_1^A$ defined implicitly by

$$u_1(m, L_1^A, \bar{L}_2) = \bar{u}. \tag{36}$$

Since $u_1(m, L_1, \bar{L}_2)$ is continuous and monotonically decreasing in $L_1$ with $u_1(m, 0, \bar{L}_2) = \infty$, inequality (34) implies that a unique value of $L_1^A$ exists satisfying (36), and this is the value at which workers are just indifferent between (1, 1, 1, 1) and (1, 0, 1, 0).

We are thus led to the following candidate equilibrium: some mobile workers are following the strategy (1, 1, 1, 1) while others are following (1, 0, 1, 0). As a result, $L_2(j) = \bar{L}_2$ for $j = 1, 2, 3, 4$, $L_1(1) = L_1(3) = \bar{L}_1$, and $L_1(2) = L_1(4) = L_1^A$. Workers are indifferent between strategies (1, 1, 1, 1) and (1, 0, 1, 0) in all states of nature and strictly prefer these to (1, 0, 0, 2).

This equilibrium looks just like the classical analysis of unemploy¬ 
ment that arises when the marginal product of labor falls below the 
marginal utility of leisure, with one important exception: nothing 
about the structure of the equilibrium precludes ex post regret. It is perfectly possible that the unemployed workers of sector 1 envy those who are still employed in sector 2 in the sense that $\bar{u} < u_2(X(2), L_1(2), L_2(2))$ or even perhaps $J(s_t = 2, \xi_{t,k} = 1) < J(s_t = 2, \xi_{t,k} = 2)$. If an unemployed worker were able to trade places instantly and costlessly
with one of the employed in another sector, he might well wish to do 
so. Given the real costs of doing so, however, he chooses not to, and it 
is in this sense that the labor market clears in this model even though 
the marginal utility of leisure for an unemployed worker may be less 
than the marginal product of labor for some of those who remain 
working. 


Case B 

Suppose instead that inequalities (34) and (35) were reversed:

$$u_1(m, \bar{L}_1, \bar{L}_2) - \bar{u} > 0, \tag{37}$$

$$\lambda q[u_2(m, \bar{L}_1, \bar{L}_2) - u_1(m, \bar{L}_1, \bar{L}_2)] + \lambda(1 - q)[\bar{u} - u_1(M, \bar{L}_1, \bar{L}_2)] + (1 - \lambda q)[\bar{u} - u_1(m, \bar{L}_1, \bar{L}_2)] > 0. \tag{38}$$

Under these conditions, if we again (wrongly) hypothesized that $L_1(j) = \bar{L}_1$ and $L_2(j) = \bar{L}_2$ in all states $j$, all mobile workers would want to quit sector 1 in state 2 because, from (38), (1, 0, 0, 2) would look better than (1, 1, 1, 1) whenever the economy was in state 2. The level of employment $L_1$ in states 2-4 thus must fall (and the level of employment $L_2$ in state 4 correspondingly rise) sufficiently far to raise the attractiveness of (1, 1, 1, 1) and reduce that of (1, 0, 0, 2) until, conditional on the economy's being in state 2 (when the choice between these two strategies is made), workers are indifferent. Thus the equilibrium consists of the strategy (1, 1, 1, 1) being adopted by some workers and (1, 0, 0, 2) adopted by others, with $L_1(1) = \bar{L}_1$, $L_1(2) = L_1(3) = L_1(4) = L_1^B$, $L_2(1) = L_2(2) = L_2(3) = \bar{L}_2$, and $L_2(4) = \bar{L}_1 + \bar{L}_2 - L_1^B$. Here $L_1^B$ is defined by

$$\begin{aligned} & \lambda q[u_2(m, L_1^B, \bar{L}_1 + \bar{L}_2 - L_1^B) - u_1(m, L_1^B, \bar{L}_1 + \bar{L}_2 - L_1^B)] \\ & \quad + \lambda(1 - q)[\bar{u} - u_1(M, L_1^B, \bar{L}_2)] + (1 - \lambda q)[\bar{u} - u_1(m, L_1^B, \bar{L}_2)] = 0. \end{aligned} \tag{39}$$
Again inequality (38) together with our assumptions about $u_1(\cdot)$ and $u_2(\cdot)$ ensures the existence of a unique value $L_1^B$ satisfying (39).

Considering again the reference case $L_1(j) = \bar{L}_1$ and $L_2(j) = \bar{L}_2$ for all states $j$, (37) would imply that no one who remained committed to sector 1 would ever choose unemployment in the reference case. Since decreasing $L_1$ relative to $\bar{L}_1$ or increasing $L_2$ relative to $\bar{L}_2$ only increases the superiority of sector 1 jobs over unemployment, no one who remains committed to sector 1 is ever unemployed in this equilibrium.

Note again that even though the labor market is in equilibrium, the 
possibility of ex post regret arises. Those workers following (1,0,0,2) 
are relatively better off in state 4 and worse off in state 3 than those 
following (1, 1, 1, 1). Nevertheless, no worker has an incentive to 
make any decision differently because when the choice between these 
strategies was made (in state 2) the worker was just indifferent. 


Case C 

Consider next

$$u_1(m, \bar{L}_1, \bar{L}_2) - \bar{u} > 0, \tag{40}$$

$$\lambda q[u_2(m, \bar{L}_1, \bar{L}_2) - u_1(m, \bar{L}_1, \bar{L}_2)] + \lambda(1 - q)[\bar{u} - u_1(M, \bar{L}_1, \bar{L}_2)] + (1 - \lambda q)[\bar{u} - u_1(m, \bar{L}_1, \bar{L}_2)] < 0. \tag{41}$$

These conditions turn out to imply a full-employment equilibrium: all workers choose (1, 1, 1, 1) with $L_1(j) = \bar{L}_1$ and $L_2(j) = \bar{L}_2$ for all $j$. Inequality (40) implies that no worker who was permanently committed to sector 1 would ever choose unemployment, while (41) implies that (1, 1, 1, 1) dominates (1, 0, 0, 2).


Case D 

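The final possibility is

$$u_1(m, \bar{L}_1, \bar{L}_2) - \bar{u} < 0,$$

$$\lambda q[u_2(m, \bar{L}_1, \bar{L}_2) - u_1(m, \bar{L}_1, \bar{L}_2)] + \lambda(1 - q)[\bar{u} - u_1(M, \bar{L}_1, \bar{L}_2)] + (1 - \lambda q)[\bar{u} - u_1(m, \bar{L}_1, \bar{L}_2)] > 0.$$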

This turns out to be the most complicated case, with a variety of 
mixed employment-unemployment equilibria possible depending on 
additional specification of parameters. Since the qualitative behavior 
of the system and the economic sources of unemployment arising in 
case D are essentially the same as in cases A and B, I omit a detailed 
catalog of the possibilities here. 
Fig. 1. Regions A, B, C, and D plotted as functions of high ($M$) and low ($m$) levels of endowment of $X$.

Figure 1 illustrates cases A-D as different regions in ($M$, $m$) space. The figure was drawn for the following parameter values. Immobile workers account for 80 percent of the labor force ($\bar{L}_1 = 1$, $\bar{L}_2 = 4$), while sector 2 output constitutes 70 percent of GNP ($\theta = 0.3$). Choosing $b = 2$ implies that when $X = 1.0$ and $L_1 = \bar{L}_1$ and $L_2 = \bar{L}_2$, the factor $X$ has a dollar share $\gamma\theta = 0.1$ of total GNP. An important dependence of purchases of good 1 on $X$ is reflected in the choice $\rho = -1.5$. I further used $\lambda = 0.95$, $\eta = \beta = 0.7$, and $p = q = 0.8$. The marginal value of leisure was set equal to the utility level for workers employed in sector 1 when $X = 0.65$ ($\bar{u} = u_1(0.65, 1, 4)$). These parameter values generate all four cases A-D for relatively modest variation in $X$.¹¹
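
The mapping from a pair ($M$, $m$) to one of the four cases can be sketched directly from the two case-defining inequalities. The code below is an illustrative sketch, not the procedure used to draw Fig. 1: all names are hypothetical, the production exponents are labeled `ETA` and `BETA` by assumption, the maintained assumptions (31)-(33) are not checked, and knife-edge equalities are ignored.

```python
# Parameter values quoted in the text; names are illustrative assumptions.
THETA, RHO, B = 0.3, -1.5, 2.0
ETA, BETA = 0.7, 0.7
LAM, Q = 0.95, 0.8
L1_BAR, L2_BAR = 1.0, 4.0


def gamma(x, l1):
    """Share gamma(X, L1) = 1 / (1 + b [F(L1)/X]^rho) with F(L1) = L1**ETA."""
    return 1.0 / (1.0 + B * (l1 ** ETA / x) ** RHO)


def u1(x, l1, l2):
    """Equation (21) under Cobb-Douglas technology."""
    g = gamma(x, l1)
    return (g ** (-THETA / RHO) * THETA * (1 - g) * x ** THETA
            * (l2 ** BETA) ** (1 - THETA) * ETA / l1)


def u2(x, l1, l2):
    """Equation (22) under Cobb-Douglas technology."""
    g = gamma(x, l1)
    return (g ** (-THETA / RHO) * (1 - THETA) * x ** THETA
            * (l2 ** BETA) ** (-THETA) * BETA * l2 ** (BETA - 1))


# Marginal value of leisure, set as in the text.
U_BAR = u1(0.65, L1_BAR, L2_BAR)


def switch_gain(big_m, small_m):
    """Left-hand side shared by (35), (38), and (41) at reference employment."""
    return (LAM * Q * (u2(small_m, L1_BAR, L2_BAR) - u1(small_m, L1_BAR, L2_BAR))
            + LAM * (1 - Q) * (U_BAR - u1(big_m, L1_BAR, L2_BAR))
            + (1 - LAM * Q) * (U_BAR - u1(small_m, L1_BAR, L2_BAR)))


def classify(big_m, small_m):
    """Return 'A', 'B', 'C', or 'D' from the signs of the two inequalities."""
    quits = u1(small_m, L1_BAR, L2_BAR) - U_BAR < 0   # (34) versus (37)/(40)
    switches = switch_gain(big_m, small_m) > 0        # (38) versus (35)/(41)
    if quits:
        return 'D' if switches else 'A'
    return 'B' if switches else 'C'
```

Because $u_1$ is increasing in $X$ and $\bar{u}$ is pinned at $X = 0.65$, any $m$ below 0.65 falls in region A or D and any $m$ above 0.65 falls in region B or C; the second inequality separates the two members of each pair.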

IV. On the Nature of Unemployment 

All markets clear in this model. There is thus a sense in which any 
unemployment that arises in this model is strictly voluntary. One 
could of course explore the kinds of phenomena studied here in a 
sticky-price framework, though I have not attempted that exercise. 
Instead I would like to discuss in this section the extent to which the 
neoclassical view of the labor market explored here might be consis¬ 
tent with other widely held perceptions about the nature of aggregate 
fluctuations in unemployment.¹²

One important stumbling block in accepting a theory of voluntary 
unemployment is its implication in the minds of many economists that 
the worker in some sense enjoys or prefers being unemployed. The 
neoclassical position has sometimes been parodied as explaining re¬ 
cessions through “a contagious attack of laziness” or ridiculed on the 
grounds that if recessions represent increased consumption of lei¬ 
sure, we should see booming sales of recreational vehicles during 
hard times. Such criticisms do not apply to the model presented here, 
for it is quite clear that the worker views the transition from state M, 
in which he is certain to be employed, to state m, in which he may well
lose his job, as distinctly undesirable, a development that indeed may 
leave him quite miserable. Of course the worker would prefer to have 
as high a marginal product as he enjoyed in some earlier state. Unem¬ 
ployment in this model is not desired by the worker, but it is the best 
he can do with unfortunate circumstances. Unemployment is clearly 
bad, just not as bad as shining shoes. 

Moreover, I noted above that the unemployment equilibria exam¬ 
ined in this paper also allow the possibility of ex post regret. In the 
case A equilibrium, for example, workers following the (1,0, 1,0) 
strategy will be better off than workers in sector 2 in states 1 and 3 but 


11 The reader should not infer from the narrowness of regions A and B in fig. 1 that unemployment of type A or B requires a delicate knife-edge condition. By choosing $\bar{u}$ to equal $u_1(0.60, \bar{L}_1, \bar{L}_2)$ instead of $u_1(0.65, \bar{L}_1, \bar{L}_2)$ as drawn, region B becomes much bigger; by choosing $\bar{u} = u_1(0.70, \bar{L}_1, \bar{L}_2)$, A would grow. The parameters were chosen to show both regions for relatively modest variation in $X$. Recall further that region D exhibits the same sort of unemployment equilibria as regions A and B.

12 My thinking on this point has been strongly influenced by Lucas's (1978) eloquent essay.
possibly worse off in states 2 and 4. Depending on how the world
turns out, the unemployed can surely end up envying those who 
earlier received the training necessary to have the highest marginal 
product in the current state of the world. The unemployment is 
nevertheless voluntary in the sense that at the time such commitments 
had to be made, the path the worker actually chose was rationally 
expected to yield the highest utility. 

I have modeled this envy as taking place across sectors; an unem¬ 
ployed worker committed to sector 1 would not envy a worker cur¬ 
rently employed in sector 1 in my model. This would seem to be 
largely a matter of how the model was set up: if different workers at 
the same firm have different firm-specific skills or attributes and dif¬ 
ferent histories of employment with that firm, a worker cannot ex 
post costlessly trade his attributes for those of another, and the possi¬ 
bility of envy or regret clearly still arises. 

Another major stumbling block in accepting a theory of voluntary 
unemployment is the identification of variables that could plausibly 
cause a sufficiently large decrease in labor’s marginal product. The 
model considered here highlights that it is not necessary to identify an 
aggregate phenomenon that depresses the marginal product of all 
workers below some reservation level. Instead, any event that hits one 
sector sufficiently hard will, at least in the short run, lead to an in¬ 
crease in the aggregate unemployment rate. 

The "waiting-time" unemployment of case A is also consistent with empirical evidence suggesting that a substantial portion of the unemployed are eventually rehired at their old jobs (see Feldstein 1975; Lilien 1980; Katz 1986; Murphy and Topel 1987). Note that the expected duration of such unemployment is given by

$$(1 - q)(1 + 2q + 3q^{2} + 4q^{3} + \ldots) = \frac{1}{1 - q}.$$

Nothing in the rational expectations structure of the model prevents this from being quite a long period of time. The duration of type B unemployment in my model is given by the training or relocating delay, which again could plausibly be quite substantial.¹³
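
The expected duration of waiting-time unemployment can be checked numerically by truncating the series. This is a small illustrative sketch; the function name and the truncation length are arbitrary choices, not part of the model.

```python
def expected_duration(q, terms=2000):
    """Truncated value of (1 - q) * (1 + 2q + 3q^2 + ...).

    A spell lasts n periods with probability (1 - q) * q**(n - 1),
    so the sum is the mean of a geometric distribution.
    """
    return (1 - q) * sum(n * q ** (n - 1) for n in range(1, terms + 1))
```

With the value $q = 0.8$ used for Fig. 1, the truncated sum agrees with $1/(1 - q) = 5$ periods to within rounding error.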

V. On the Sources of Business Fluctuations 

In earlier research I reported a statistically significant correlation be¬ 
tween oil price shocks and economic recessions in postwar U.S. data. 


13 Topel and Weiss (in press) argued that if unemployed workers are furthermore waiting for the resolution of uncertainty between states 1 and 2, this could add a significant additional source of unemployment duration. See also n. 9.
These findings have been largely corroborated with evidence from
other data sets by Burbidge and Harrison (1984), Davis (1984, 1985), 
Loungani (1985, 1986), Santini (1985), and Gisser and Goodwin 
(1986). Postwar oil shocks could not have been predicted statistically 
on the basis of previous behavior of macroeconomic aggregates 
(Hamilton 1983) and moreover can convincingly be traced to specific 
exogenous historical events (Hamilton 1985). Such evidence makes it 
difficult to reject the historical correlation as entirely spurious. 

Let us first ask how large a change between $M$ and $m$ is necessary for our model economy to exhibit an unemployment equilibrium of type A. Use (23) to linearize $u_1(m, \bar{L}_1, \bar{L}_2)$ around $\log m = \log M$:

$$u_1(m, \bar{L}_1, \bar{L}_2) \approx u_1(M, \bar{L}_1, \bar{L}_2) + u_1(M, \bar{L}_1, \bar{L}_2) \cdot (\theta - \rho) \cdot \gamma(M, \bar{L}_1) \cdot (\log m - \log M).$$

Thus a first-order approximation to (34) reveals that type A unemployment will arise when

$$\log M - \log m > \frac{u_1(M, \bar{L}_1, \bar{L}_2) - \bar{u}}{(\theta - \rho) \cdot \gamma(M, \bar{L}_1) \cdot u_1(M, \bar{L}_1, \bar{L}_2)}. \tag{42}$$

Any drop in $X$ of a magnitude greater than (42) will generate unemployment of this type. Once the economy is in region A, one can find the contribution of the decrease from $M$ to $m$ at the margin by differentiating (36) and evaluating at that value $m^A$ where $\bar{u} = u_1(m^A, \bar{L}_1, \bar{L}_2)$:

$$\frac{d \log L_1^A}{d \log m} = \frac{(\theta - \rho) \cdot \gamma(m^A, \bar{L}_1)}{1 - \theta\eta[1 - \gamma(m^A, \bar{L}_1)] - \rho\eta \cdot \gamma(m^A, \bar{L}_1)}. \tag{43}$$

Similarly, for case B a linearization of (38) around $\log m = \log M$ shows that the drop from $M$ to $m$ is sufficient to generate type B unemployment when

$$\log M - \log m > \frac{(1 + \lambda - 2\lambda q)\bar{u} - (1 + \lambda - \lambda q)u_1(M, \bar{L}_1, \bar{L}_2) + (\lambda q)u_2(M, \bar{L}_1, \bar{L}_2)}{[(\lambda q\theta)u_2(M, \bar{L}_1, \bar{L}_2) - (\theta - \rho)u_1(M, \bar{L}_1, \bar{L}_2)]\,\gamma(M, \bar{L}_1)}. \tag{44}$$

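Define $m^B$ to be that value of $m$ for which $L_1^B = \bar{L}_1$ in equation (39). Then the contribution of $m$ at the case B margin would be given by

$$\begin{aligned} \frac{d \log L_1^B}{d \log m} = {} & \{[-(\lambda q\theta)u_2(m^B, \bar{L}_1, \bar{L}_2) + (\theta - \rho)u_1(m^B, \bar{L}_1, \bar{L}_2)] \cdot \gamma(m^B, \bar{L}_1)\} \\ & \div \Big\{ (\lambda q)u_2(m^B, \bar{L}_1, \bar{L}_2) \cdot \big\{\theta\eta[1 - \gamma(m^B, \bar{L}_1)] + (\bar{L}_1/\bar{L}_2)[1 - \beta(1 - \theta)]\big\} \\ & \quad + u_1(m^B, \bar{L}_1, \bar{L}_2) \cdot \big\{1 - \theta\eta[1 - \gamma(m^B, \bar{L}_1)] - \rho\eta \cdot \gamma(m^B, \bar{L}_1) + \lambda q(\bar{L}_1/\bar{L}_2)\beta(1 - \theta)\big\} \\ & \quad + \lambda(1 - q) \cdot u_1(M, \bar{L}_1, \bar{L}_2) \cdot \big\{1 - \theta\eta[1 - \gamma(M, \bar{L}_1)] - \rho\eta \cdot \gamma(M, \bar{L}_1)\big\} \Big\}. \end{aligned} \tag{45}$$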

From equation (19), $\gamma = P_X X/(Y_1 + P_X X)$ denotes the dollar value of energy as a share of $Y_1 + P_X X$, and $\gamma\theta$ denotes the share of energy in total GNP ($= P_X X/[Y_1 + P_2 Y_2 + P_X X]$). Expressions (42)-(45) depend in part on the factor share $\gamma$. For a smaller value of $\gamma$, a larger drop in $X$ is necessary to generate unemployment, and the contribution of $m$ at the margin is also smaller the lower the value of $\gamma$. However, the effect also depends on $\rho$, a large absolute value of which would arise when purchases of $Y_1$ are closely tied to $X$. From (42) and (44), the larger the absolute value of $\rho$, the smaller is the change in $X$ necessary to produce unemployment, and from (43) and (45), the larger would be the drop in $L_1^A$ or $L_1^B$ at the margin. Indeed, consider the limiting case as $\rho \to -\infty$. The utility function (8) then becomes

$$\{\min[x_t(k), c_{1,t}(k)]\}^{\theta}\,[c_{2,t}(k)]^{1-\theta} + \bar{u} \cdot h_t(k),$$

and each consumer will choose $x_t(k) = c_{1,t}(k)$ with $X_t = Y_{1,t}$ in equilibrium.¹⁴ For the slightest decrease in $m$ below $M$, $\gamma$ would increase to one if $L_1$ stayed at $\bar{L}_1$, and so $u_1(m, \bar{L}_1, \bar{L}_2) = 0$. This means that (34) would hold, and unemployment would result from the slightest decrease in $m$ below $M$ no matter what the value of $\gamma$. Equally clearly, in the limit as $\rho \to -\infty$, equilibrium would require employment in state 2 to equal that value $L^*$ satisfying $F(L^*) = m$ from which

$$\frac{d \log L^*}{d \log m} = \frac{1}{\eta}$$

at the margin, again no matter how small $\gamma$.

The total effect on the economy is thus limited not by $\gamma$, the dollar share of energy, but rather by $\theta$, the dollar share of products whose use depends critically on energy. The intuition for this result is straightforward. If $X$ is truly indispensable to the consumer in being able to use good $Y_1$ ($\rho = -\infty$), then a 10 percent reduction in $X$ will require a 10 percent reduction in $Y_1$ in equilibrium. Since $Y_1 = L_1^{\eta}$, this means a $(1/\eta) \cdot 10$ percent reduction in sector 1 employment. Economists usually think of the associated displaced workers as being paid their marginal product both in the affected sector and in alternative jobs; hence the dollar value of output lost by sector 1 is supposed to be made up by an equivalent dollar value increase elsewhere as the workers shift to alternative jobs. However, if significant costs or delays are associated with finding new jobs, then the increase in $Y_2$ may follow the drop in $Y_1$ only after a significant lag. Indeed, it may be the case that workers perceive it not to be in their interests to look for employment in sector 2 at all, instead choosing to remain unemployed in the hopes of a return to prosperity for sector 1. Thus the difficulties in relocating specialized labor could explain why seemingly small supply disruptions can have fairly large effects on the economy as a whole.

14 Obviously to include this case we must normalize units so that $M = F(\bar{L}_1)$; see Arrow et al. (1961, p. 231) on this point.


Appendix A 


Here I show that if the macroeconomy follows the four-state Markov process 
described in cases A, B, and C, then any individual worker’s optimal strategy 
consists of a sector to which he will always be committed whenever the econ¬ 
omy is in state 1 and a second sector whenever the economy is in state 4; if 
these sectors are different, the worker is of course switching between the two 
in states 2 and 3. 

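The value function (25) is characterized by the recursion

$$\begin{aligned} J(s_t, \xi_{t,k}) = \max\Big\{ & \Big[u_{\xi_{t,k}}(X(s_t), L_1(s_t), L_2(s_t)) + \lambda \sum_{j=1}^{4} P[s_{t+1} = j \mid s_t] \cdot J(s_{t+1} = j, \xi_{t+1,k} = \xi_{t,k})\Big], \\ & \Big[\bar{u} + \lambda \sum_{j=1}^{4} P[s_{t+1} = j \mid s_t] \cdot J(s_{t+1} = j, \xi_{t+1,k} = \xi_{t,k})\Big], \\ & \Big[\bar{u} + \lambda \sum_{j=1}^{4} P[s_{t+1} = j \mid s_t] \cdot J(s_{t+1} = j, \xi_{t+1,k} \ne \xi_{t,k})\Big] \Big\}. \end{aligned} \tag{A1}$$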

Observation 1. Suppose that worker $k$ chooses to shift from sector $\xi$ to sector $\xi'$ at some date when the technology state was $M$ (alternatively, $m$). Then the worker would never choose to shift from $\xi'$ back to $\xi$ at any future date when the technology state was again $M$ (alternatively, $m$).

Proof. We are told that the worker chose to shift to $\xi'$ in some period $t$ with $X_t = M$, say. The worker evidently preferred this to remaining committed to $\xi$ and just taking a period of unemployment:

$$\Big[\bar{u} + \lambda \sum_{j=1}^{4} P[s_{t+1} = j \mid X_t = M] \cdot J(s_{t+1} = j, \xi')\Big] \ge \Big[\bar{u} + \lambda \sum_{j=1}^{4} P[s_{t+1} = j \mid X_t = M] \cdot J(s_{t+1} = j, \xi)\Big]. \tag{A2}$$

But from (26), the transition probabilities are the same in any future period $t'$ for which $X_{t'} = M$ as they were in $t$. Replacing $t$ in expression (A2) with $t'$ reveals that the individual would not want to cycle back to the original sector $\xi$ for period $t'$; he is at least as well off remaining committed to sector $\xi'$ and just taking a period of unemployment. Q.E.D.
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Observation 2. No individual will want to shift from sector 1 to sector 2 
during state 4. 

Proof. Consider two cases. (a) When faced with $s_t = 2$ and $\xi_{t,k} = 1$, the
worker chooses to shift from sector 1 to sector 2. Then from observation 1,
the worker will never choose to shift from 2 to 1 during any subsequent state
4. But since any given state 4 must be preceded by either state 2 or state 4, the
worker under case a would never be in sector 1 in state 4; thus for this case
the observation holds trivially. (b) When faced with $s_t = 2$ and $\xi_{t,k} = 1$, the
worker chooses not to shift from sector 1 to sector 2. From this we can infer
that

$$
\max\Bigl\{ u_1\bigl(X(2), L_1(2), L_2(2)\bigr)
  + \kappa \sum_{j=1}^{4} P[s_{t+1}=j \mid s_t=2]\, J(s_{t+1}=j,\ \xi_{t+1,k}=1),\
\Bigl[\bar{u} + \kappa \sum_{j=1}^{4} P[s_{t+1}=j \mid s_t=2]\, J(s_{t+1}=j,\ \xi_{t+1,k}=1)\Bigr] \Bigr\}
$$
$$
\ge \Bigl[\bar{u} + \kappa \sum_{j=1}^{4} P[s_{t+1}=j \mid s_t=2]\, J(s_{t+1}=j,\ \xi_{t+1,k}=2)\Bigr].
$$

Recall that for equilibria A-C, $L_1(2) = L_1(4)$ and $L_2(2) \le L_2(4)$. Also X(2)
= X(4). Thus $u_1(X(2), L_1(2), L_2(2)) \le u_1(X(4), L_1(4), L_2(4))$. It follows from
$P[s_{t+1} = j \mid s_t = 2] = P[s_{t+1} = j \mid s_t = 4]$ that for any date t' in which the
economy is in state 4,

$$
\max\Bigl\{ u_1\bigl(X(4), L_1(4), L_2(4)\bigr)
  + \kappa \sum_{j=1}^{4} P[s_{t'+1}=j \mid s_{t'}=4]\, J(s_{t'+1}=j,\ \xi_{t'+1,k}=1),\
\Bigl[\bar{u} + \kappa \sum_{j=1}^{4} P[s_{t'+1}=j \mid s_{t'}=4]\, J(s_{t'+1}=j,\ \xi_{t'+1,k}=1)\Bigr] \Bigr\}
$$
$$
\ge \Bigl[\bar{u} + \kappa \sum_{j=1}^{4} P[s_{t'+1}=j \mid s_{t'}=4]\, J(s_{t'+1}=j,\ \xi_{t'+1,k}=2)\Bigr],
$$

which shows that if the worker did not shift in t, he would not shift in t'.
Q.E.D.

Observation 3. No individual would want to shift from sector 2 to sector 1
during state 1.

Proof. The proof precisely parallels that of observation 2, using the fact that
$u_2(X(3), L_1(3), L_2(3)) \le u_2(X(1), L_1(1), L_2(1))$. Q.E.D.

Observations 2 and 3 rule out the most interesting switches that might be 
hypothesized to occur during states 1 and 4. For completeness, consider the 
last two cases: (a) someone who switches from sector 1 to sector 2 in state 1 
(when sector 2 is in fact the less desirable of the two) and (b) someone who 
switches from sector 2 to sector 1 in state 4 (again switching against the 
natural ordering). These can be ruled out as well. The proof is a bit involved 
and unilluminating, and for this reason I offer only a very brief sketch for the 
case of equilibrium B. In this equilibrium, $u_2(X(4), L_1(4), L_2(4)) > u_1(X(4),
L_1(4), L_2(4))$, and $u_i(X(j), L_1(j), L_2(j)) > \bar{u}$ for j = 1, 2, 3, 4 and i = 1, 2. From
these one can establish that any hypothesized non-state-dependent strategy
associated with a or b is dominated by the state-dependent (but infeasible)
strategy (2, 2, 1, 1). One can then use (31) to show that V([1, 1, 1, 1], 1) > V([2,
2, 1, 1], 1), establishing that no worker would ever be willing to quit the sector
1 job during state 1.


Appendix B 

Here I show that under assumptions (31)-(33) and M > m, no worker would
choose a permanent commitment to sector 2. Note first that for any of equilibria
A-C, if one is to be employed in sector 2, state 1 is the most favored,
followed by 3, 2, and then 4:

$$
u_2(X(1), L_1(1), L_2(1)) \ge u_2(X(3), L_1(3), L_2(3))
\ge u_2(X(2), L_1(2), L_2(2)) \ge u_2(X(4), L_1(4), L_2(4)).
$$

The particular permanent sector 2 strategy that is optimal for the worker 
depends on where 0 would be placed in this chain of inequalities, but clearly 
the optimal choice must be one of (0,0,0,0), (2,0,0,0), (2,0,2,0), (2,2,2,0), 
or (2, 2, 2, 2). 

Now (0, 0, 0, 0) is dominated by (1, 0, 0, 0) by assumption (32). Likewise,
(33) implies that (2, 0, 0, 0) is dominated by (1, 0, 0, 0) and that (2, 0, 2, 0) is
dominated by (1, 0, 1, 0). We saw in the text that (2, 2, 2, 2) is dominated by (1,
0, 0, 2), leaving (2, 2, 2, 0) as the sole element to be considered. But in the case
B equilibrium, this is dominated by (2, 2, 2, 2), whereas in the case A equilibrium,
it is dominated by one of (2, 2, 2, 2) or (2, 0, 2, 0), both of which were
dismissed above.


Intergenerational Flows of Time and Goods:
Consequences of Slowing Population Growth

It is feared that low fertility and older age distributions in the developed
countries might cause lower life cycle consumption because of
the increased pension and health cost burden. The theoretical literature
on intergenerational transfers has addressed this question but
has considered only the consumption of market goods and has made
no serious empirical attempt to measure the theoretical concepts
necessary to assess the problem. This paper develops a theoretical
model of intergenerational transfers incorporating time use. With
the aid of time budget and consumer expenditure surveys, empirical
estimates of the age profiles of various types of time and goods
consumption are presented, and we conclude that (1) the net direction
of intergenerational transfers is from younger to older ages;
(2) under the golden-rule assumption, these transfers largely constitute
an externality to childbearing; and (3) they are not large enough
to offset the capital dilution effect that would result from higher
fertility and more rapid population growth.


I. Introduction 

Fertility is now below replacement levels in almost all developed countries.
In combination with recent declines in mortality, particularly at
older ages, this has led to widespread concern about the perceived
problems of an aging and aged population and, in particular, with the
problem of old-age support. Such concerns have been translated into
pronatalist policies in many countries.

Policy discussions of aging often restrict attention not only to 
public-sector transfers but within these to pension-type transfers. 
Such a treatment gives a very partial and distorted view of the costs 
and benefits of higher fertility and is an entirely unsuitable basis for 
policy decisions concerning fertility. 

A more comprehensive view would examine the effects of fertility
and age distribution on intergenerational public transfers at all levels
of government (including, e.g., health and education as well as pensions),
on intergenerational market transactions and asset accumulation,
and on intergenerational transfers within the family. In addition
to considering intergenerational transfers, analysis of the costs and
benefits to higher fertility should include the direct satisfaction parents
receive from their children. This additional consideration, however,
implies that fertility is endogenous, and we are led to consider
why fertility might be at a suboptimal level and wherein lie the
externalities to childbearing.

A growing literature on models of economic growth with overlapping
generations, stemming from the seminal works of Samuelson
and Diamond, is potentially germane to these issues. While much of
this literature is too highly stylized to guide or interpret actual empirical
work (common assumptions are only two age groups, no families,
no public sector, or no capital), papers by Arthur and McNicoll (1978) 
and by Willis (1988) helpfully incorporate transfers from several 
sources and demographic realism. 

This paper attempts an empirical assessment of intergenerational 
transfers and the gains and losses to higher fertility. Lee (1985) developed
the theoretical and accounting apparatus for estimating the
family transfers, while allowing marginal costs of children to differ 
from average costs. This paper extends that work by incorporating 
the allocation of time. 

Previous studies have focused only on physical goods. Yet their
production occupies a relatively small portion of the total life cycle
time at an individual’s disposal. For example, if everyone lived to the
age of 70 and worked at market tasks from the age of 20 to 65, with
men working 40 hours per week and women 20, then time devoted to
market work would account for only about 11.5 percent of the total
time budget. Even when 10 hours per day as a basic minimum necessary
for biological maintenance (eating, sleeping, etc.) are deducted,
the fraction of market time out of the remainder is still only about 20
percent of this “discretionary” time. Furthermore, children are time
intensive, so assessment of market costs alone might bias the estimated
direction of transfers.
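
The two percentages in this example can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the paragraph; the one added assumption, labeled in the code, is that men and women are equally numerous, so the two workweeks are simply averaged. The constant and function names are illustrative.

```python
# Share of lifetime hours spent in market work, using the paragraph's numbers.
LIFESPAN_YEARS = 70
WORK_START_AGE, WORK_END_AGE = 20, 65
HOURS_PER_WEEK_MEN, HOURS_PER_WEEK_WOMEN = 40, 20
MAINTENANCE_HOURS_PER_DAY = 10  # eating, sleeping, etc.


def market_time_shares():
    working_years = WORK_END_AGE - WORK_START_AGE
    # Assumption: equal numbers of men and women, so average the two workweeks.
    avg_hours_per_week = (HOURS_PER_WEEK_MEN + HOURS_PER_WEEK_WOMEN) / 2
    # Fraction of all lifetime hours spent in market work.
    share_of_total = (working_years / LIFESPAN_YEARS) * (avg_hours_per_week / (7 * 24))
    # Fraction of each day left after biological maintenance.
    discretionary_fraction = (24 - MAINTENANCE_HOURS_PER_DAY) / 24
    share_of_discretionary = share_of_total / discretionary_fraction
    return share_of_total, share_of_discretionary


total_share, discretionary_share = market_time_shares()
print(f"{total_share:.3f} {discretionary_share:.3f}")  # prints 0.115 0.197
```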

Might, then, this exclusive focus on the intergenerational flows of
market goods conceal some other important aspects of intergenerational
resource flows more broadly construed? While time itself cannot
literally be transferred between individuals, individuals may use
their time to produce home time services, just as their market work
enables them to buy market goods and services. Home-produced services
can then be transferred to other persons. For example, meals
prepared by one person are consumed by other members of the
household. Moreover, the amount of time spent in home production
increases when couples have children, in part because of time expenditure
in directly caring for children, so that the parents’ leisure and
sleep are reduced. We hope to capture these sorts of transfers by
including time in our analysis.

After a brief discussion of theoretical background, the first part of
the paper develops an accounting framework for analyzing the
steady-state life cycle resource implications of changed fertility and
the measurement of the total (family, market, and government) social
transfer. The analysis is guided by the assumption of life cycle utility
maximization regarding labor supply, consumption, home production,
leisure, and schooling. Family transfers take place within the
household and are subject to economies of household scale. Market
and government transfers are made between households and are
mainly unaffected by economies of household scale. The second portion
of the paper derives empirical estimates of the within-household
and between-household components of the total transfer. Conclusions
here are, we believe, relatively robust to model specification.

The third part of the paper briefly considers the wider generality of 
the results and their welfare implications. These welfare implications 
depend on the treatment of children in the parental utility function 
(“altruism” vs. direct satisfaction from children) and are therefore 
model dependent. 


II. Theoretical Background 

Every theoretical analysis of intergenerational transfers starts with a
social budget constraint defining the feasible allocations of age-specific
consumption and endowments (or labor effort), and sometimes
including investment, for a given population age distribution.
In stationary allocations the proportional population age distribution
and the economic age profiles are fixed over time, so the social budget
constraint takes on a particularly simple form. For a given stationary
feasible allocation, any discount rate r for which an associated life
cycle budget constraint is satisfied defines a “competitive equilibrium”
(Gale 1973). Gale demonstrated that every feasible stationary allocation
has at least two competitive equilibria. One of these, called the
“golden rule,” has r = n (where n is the population growth rate); this
is Samuelson’s (1958) biological interest rate theorem.

The infinite number of stationary allocations satisfying the social 
and life cycle budget constraints will differ in important ways. Each 
individual must, by assumption, die with zero assets (all bequests are
treated as occurring inter vivos, though the aggregation of individuals
of different ages may yield a net credit or debt). Gale (1973) introduced
the terminology “Samuelson” for allocations with aggregate
net credit, “classical” for aggregate net debt, and “balanced” for zero
credit. He also pointed out that every stationary competitive equilibrium
must be either golden rule (mentioned above) or balanced.
When the economy is golden-rule Samuelson, then life cycle consumption
will be higher if fertility and the growth rate are increased;
the reverse is true in the classical case.

Willis (1988) shows how market, family, and public transfers can all
be incorporated in the model. One approach to assessing the magnitude
and direction of intergenerational transfers would be to attempt
to evaluate every intergenerational transaction, explicit or
implicit, through each of these three channels. This would be an
extremely difficult undertaking. Given the stationary allocation assumption,
however, this approach is unnecessary.¹ Willis demonstrates
that in the golden-rule case, the total net social transfer equals
the derivative of life cycle consumption with respect to the population 
growth rate, which Arthur and McNicoll (1978) had shown to be a 
function of the average ages of consumption and labor supply and of 
the capital/consumption ratio. Therefore, our empirical analysis deals 
with these quantities. Of course, the real world is far from stationary, 
which weakens our results. 

Gale (1973) originally applied the terms “Samuelson” and “classical”
to the aggregate credit balance for market transactions. By extension,
we will here use the terms to refer to the sign of the net sum of
within-household, public, and market transfers. Used thus, these
terms have no implications for the competitive equilibrium interest
rate. But if we ignore, for the moment, any utility from current and
future children and their descendants, Samuelson economies in this
broader sense benefit from higher fertility and more rapid population
growth, while classical economies benefit from slower growth. In
the Samuelson case, capital dilution tends to offset the effect, and in
the classical case to strengthen it.

1 For some purposes, it is irrelevant whether transfers take place through the market,
the family, or the government; the distinction has important implications for incentives
and hence for behavior, but the observed profile of net transfers by age is what it is, no
matter how it came to be or by what structure of incentives it is supported.

JOURNAL OF POLITICAL ECONOMY

So far, the discussion is equally applicable to virtually all models
with overlapping generations. Once we make fertility endogenous by
adding assumptions about utility from children, some results become
model specific. Eckstein and Wolpin (1985) consider a nonaltruistic
utility function in which parents care about their own consumption,
about the number of children they have, and possibly about their
children’s consumption, but not about their children’s utility. They
show that economies may tend toward a golden-rule Samuelson equilibrium
that is suboptimal, as a result of positive externalities to children.
Their specification is close to the one we introduce later. In
contrast, Pazner and Razin (1980) show that an altruistic utility function
results in a classical (efficient) balanced equilibrium that is optimal
as long as there are no public transfers distorting child costs (see
Willis [1987] for possible externalities when public transfers occur).

Our strategy here is to determine empirically whether the recent 
age-specific allocation in the United States is, at the margin, golden- 
rule Samuelson, classical, or balanced. We also evaluate the actual 
amount of the total social transfer and isolate the nonfamily part of it, 
which corresponds, under the nonaltruistic model, to the externality
to a birth. 

III. Individual Utility over the Life Cycle 

Each individual is assumed to derive satisfaction from leisure, l, from
the consumption of market commodities, c, and from the consumption
of home-produced services, h. Each of these is actually a vector of
consumption distributed by age, x, over the life cycle or, equivalently,
could be viewed as a function of age. Life cycle satisfaction, U, is a
function of these functions:

$$
U = U(c, l, h). \tag{1}
$$

Leisure is a residual category after time spent in market work, home
work, and schooling is deleted. “Market work” is considered to include
travel to and from work and eating at work. “Home work”
includes cooking, cleaning, child care, shopping, decorating, and so
on. “Schooling time” includes any time spent with the intent of subsequently
realizing higher earnings and greater general efficiency. Market
goods and home services can be transferred from one person to
another and, thus, from one generation to another; this is not true
either of school time or, more important, of leisure.

IV. The Social Budget Constraint

Consider a “consumption-loan” economy of the sort described by 
Samuelson (1958). The economy is characterized by the assumption 
that there are no capital goods, and storage is impossible; therefore, 
no more is produced than is consumed in any period, and savings are 
always zero. The assumption of a consumption-loan economy simplifies
the analysis, but the conclusions are easily extended to a neoclassical
growth model with capital, particularly in the golden-rule
case (see Lee 1980; Eckstein and Wolpin 1985). While aggregate output
must equal aggregate expenditure at any instant, this is not true
for each individual since borrowing and lending, or transfers, are
possible. Nor is it true for age groups; the average individual in some
age groups may run a net deficit, and in others a surplus.²

To write the budget for an individual, age group, or society, it is 
necessary to evaluate time in the same units as goods, which we will do 
through use of production efficiency units. These must be age specific 
since individuals at different ages are differentially efficient in both 
home and market production. Leisure, though, is consumed in simple 
units of time; an hour of leisure is enjoyed no more by an efficient 
person than by an inefficient one, ceteris paribus, although of course 
it has a higher cost for the efficient person. Likewise, schooling time is 
measured in simple time units (although a case could be made to do 
otherwise). 

The efficiency of persons aged x in production is denoted by e(x).
This is an average across sex, and it is assumed that efficiencies in
home and market work do not differ.³ The efficiency at age x will
depend on schooling inputs at earlier ages:

$$
e(x) = F[s_t(u), s_m(u),\ 0 < u < x], \tag{2}
$$

where $s_t(u)$ = own-time inputs to schooling and $s_m(u)$ = market inputs
to schooling. (Efficiency should also depend on experience and on
daily hours worked, but we ignore these complications.) An efficiency
unit of market time is converted into units of the consumption good
at the daily wage rate w, which is 24 times the hourly wage rate.⁴

2 When an age group, in total, consumes more or less than its labor income, we say
that an “intergenerational transfer” has taken place. This transfer could, however,
reflect market transactions (borrowing or lending, or the purchase or sale of physical
assets, if these exist), as well as familial or governmental transfers in the common
language sense of the term. In the steady-state, golden-rule case, which we will discuss
later, it makes little difference whether such intergenerational flows pass through markets
or are arranged by other institutions.

3 In our calculations of intergenerational transfers, we experimented with other
assumptions concerning efficiency levels in home production and differences between
women and men. Our results are robust to alternative specifications, as reported in n. 8.

We can now write down the daily budget balance for an average
person aged x, which is equivalent to the net transfer at age x. This
person starts with the value of one day, which is w · e(x). From this is
subtracted the value of leisure, school time, home services consumed,
and costs of market consumption and schooling, as follows:

$$
T(x) = w\{[1 - l(x) - s_t(x)] \cdot e(x) - h(x)\} - c(x) - s_m(x), \tag{3}
$$

where T(x) = average net transfer, l(x) = fraction of day devoted to
leisure, h(x) = consumption of home time services, measured in terms
of efficiency units, and c(x) = consumption of market goods. Suppose
now that the population is stable, growing at rate n, with survival
probabilities from birth to age x of p(x). Then the population at age x
is $B(t)e^{-nx}p(x)$, where B(t) is births at time t. The integral of all
age group budget balances (general output, general consumption),
weighted by the relative size of each age group, must be zero by the
consumption-loan assumption (ignoring the possibility of an inequality,
arising from wasted output). Thus the social budget constraint,
after the irrelevant B(t) factor is dropped, is

$$
\int e^{-nx}p(x) \cdot \bigl(w\{[1 - l(x) - s_t(x)] \cdot e(x) - h(x)\} - c(x) - s_m(x)\bigr)\,dx = 0. \tag{4}
$$

We could also write this as the requirement that full social income, R,
equal full social expenditures on time, T, and market goods, M: R =
T + M, or

$$
w\int e^{-nx}p(x)e(x)\,dx = \int e^{-nx}p(x) \cdot w\{[l(x) + s_t(x)] \cdot e(x) + h(x)\}\,dx
+ \int e^{-nx}p(x)[c(x) + s_m(x)]\,dx. \tag{5}
$$

Note that (5) could also be viewed as expressing an individual budget
constraint over the life cycle, discounted at rate n, and weighted by
the probability of survival to age x, p(x). Any life cycle age schedules
$(c, l, h, s_t, s_m)$ that satisfy (5) are socially feasible and will exactly exhaust
social resources.
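
The net transfer in (3) and the social budget constraint in (4) can be illustrated on a discrete age grid. This is a minimal sketch, not the authors' computation: the function names are hypothetical, the integral is replaced by a sum over three illustrative ages (a child, a worker, a retiree), and every number in the toy profiles is invented so that the constraint balances exactly at n = 0.

```python
import math


def net_transfer(w, l, s_t, e, h, c, s_m):
    """Equation (3): net transfer T(x) at one age, in units of the consumption good."""
    return w * ((1.0 - l - s_t) * e - h) - c - s_m


def social_budget_balance(n, w, ages, p, l, s_t, e, h, c, s_m):
    """Discrete analogue of equation (4): sum over ages of e^{-n x} p(x) T(x)."""
    total = 0.0
    for i, x in enumerate(ages):
        weight = math.exp(-n * x) * p[i]  # relative size of the age group
        total += weight * net_transfer(w, l[i], s_t[i], e[i], h[i], c[i], s_m[i])
    return total


# Hypothetical profiles for a child, a worker, and a retiree.
toy = dict(
    w=1.0,
    ages=[10, 40, 70],
    p=[1.0, 1.0, 1.0],
    l=[0.6, 0.3, 1.0],
    s_t=[0.4, 0.0, 0.0],
    e=[0.0, 2.0, 1.0],
    h=[0.1, 0.2, 0.1],
    c=[0.2, 0.5, 0.2],
    s_m=[0.1, 0.0, 0.0],
)

balance_zero_growth = social_budget_balance(0.0, **toy)       # balanced by construction
balance_positive_growth = social_budget_balance(0.01, **toy)  # same profiles, n = 0.01
```

With these invented profiles the same age schedules run a deficit once n rises to 0.01, so they are no longer feasible without adjustment; that dependence of feasibility on the growth rate is what the next section analyzes.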

V. Changing Population Growth Rates and the 
Social Budget Constraint 

If, because of a long-term difference in fertility or mortality, the age
distribution of the population were slightly altered, then the relative
sizes of age groups with surpluses or deficits would be changed, and
the consumption possibilities facing the population would likewise be
slightly altered. The implications of such changes are far easier to
analyze across steady states than in the short run, and steady-state
changes are larger and somewhat easier to analyze if they arise from a
difference in fertility rather than one in mortality (see Arthur 1981;
Preston 1982); therefore, we examine the consequences of differences
in fertility, assuming that the populations compared are stable.

4 There may be some objection to evaluating all nonmarket time at the wage rate. We
report empirical calculations of intergenerational transfers both including and excluding
leisure time, thus evaluating only time spent at school and in home production at
the wage rate.

The same analysis can be interpreted as revealing how an individual’s
life cycle opportunities are altered by living in a population with
a different age distribution (and a different n), much as we could ask 
whether an individual loses or gains from a change in the interest 
rate. In both problems the answer turns on whether the individual, on 
average, consumes earlier or later than producing; individuals who 
on average consume before producing will benefit from lower interest 
rates and slower population growth. 

It must also be emphasized that paralleling these intergenerational 
flows of resources are intergenerational flows of satisfaction arising 
from direct enjoyment of one another: parents of children, children 
of grandparents and of uncles and aunts, and so on. A change in 
fertility alters all these flows, perhaps in very important ways (in a 
one-child-family regime, there is no enjoyment of siblings, cousins, 
aunts, or uncles, but on the other hand, children do not need to 
compete for parental or grandparental time and affection); here we 
ignore most of these flows of satisfaction, although the utility parents 
derive from children will be considered below. 

To analyze the implications of a difference in growth rate we may 
differentiate the natural logs of each side of equation (5); let us begin 
with the left side: 


$$
\frac{d \ln R}{dn}
= -\frac{w\int x\,e^{-nx}p(x)e(x)\,dx}{R}
  + \frac{w\int e^{-nx}p(x)\,\frac{de(x)}{dn}\,dx}{R}
= -A_e + \frac{w\int e^{-nx}p(x)\,\frac{de(x)}{dn}\,dx}{R}. \tag{6}
$$

The first term on the right of this equation is just the negative of the
average age of an efficiency unit in the population; this average age is
denoted $A_e$. It depends not only on the life cycle distribution of
efficiency but also on the population age distribution.

The second term on the right expresses any change in individual
efficiency levels associated with a change in the growth rate. Following
equation (2), such changes would arise from changes in schooling
investments at earlier ages and, more specifically, are given by

$$
\frac{de(x)}{dn} = \int_0^x \left[ \frac{\partial e(x)}{\partial s_t(u)}\frac{ds_t(u)}{dn}
  + \frac{\partial e(x)}{\partial s_m(u)}\frac{ds_m(u)}{dn} \right] du. \tag{7}
$$


The derivative of the log of the right side of (5) is complicated but will 
turn out to have a simple interpretation: 


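$$
\frac{d \ln(T + M)}{dn}
= -\frac{1}{R}\int x\,e^{-nx}p(x)\bigl(w\{[l(x) + s_t(x)]\,e(x) + h(x)\} + c(x) + s_m(x)\bigr)\,dx
$$
$$
\quad + \frac{1}{R}\int e^{-nx}p(x)\Bigl(w\Bigl\{\Bigl[\frac{dl(x)}{dn} + \frac{ds_t(x)}{dn}\Bigr]e(x)
  + [l(x) + s_t(x)]\frac{de(x)}{dn} + \frac{dh(x)}{dn}\Bigr\}
  + \frac{dc(x)}{dn} + \frac{ds_m(x)}{dn}\Bigr)\,dx. \tag{8}
$$

The first integral is simply the weighted sum of average ages of the
various forms of consumption. The weights are the shares in total
social (or discounted individual) consumption of each item. Letting $b_L$
denote the share of “leisure,” we would have, for example,

$$
b_L = \frac{w\int e^{-nx}p(x)e(x)l(x)\,dx}{T + M}.
$$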


This first integral in (8) can thus be reexpressed as

$$
A_L \cdot b_L + A_H \cdot b_H + A_C \cdot b_C + A_{s_m} \cdot b_{s_m} + A_{s_t} \cdot b_{s_t}. \tag{9}
$$

But this sum is just a decomposition of the average age of resource
use, let us say $A_{TM}$ for time and market uses.

The remainder of the right-hand side of (8) is the present value of 
changes in all the life cycle profiles of leisure, goods consumption, 
school time, home production time, and efficiency, divided by R. It 
expresses a proportional change in the social resources available per 
person or, alternatively, can be interpreted as a proportional change 
in the present value of individual life cycle resources. Thus if this 
integral is positive, it represents an improvement in attainable social 
or individual utility. Let Q represent the portion of the integral involving
the arguments of the utility function (1), that is, the proportionate
change in full life cycle income excluding schooling. Then
combining equations (6), (7), and (9), we find, after some manipulation,

$$
Q = A_{TM} - A_e
  + \frac{w\int e^{-nx}p(x)[1 - l(x) - s_t(x)]\,\frac{de(x)}{dn}\,dx}{R}
  - \frac{\int e^{-nx}p(x)\left[w\,\frac{ds_t(x)}{dn} \cdot e(x) + \frac{ds_m(x)}{dn}\right]dx}{R}. \tag{10}
$$

Now note that the first integral gives the present value of expected life
cycle gains in productive activities resulting from changes in e(x),
arising from changes in $s_t(x)$ and $s_m(x)$ as described in equation (7).
The second integral is the present value of the costs of the changes in
schooling investments. Under optimal life cycle, or social planning,
these will be equal and will cancel as will be seen in the next section.
For the moment, we will call this quantity ε and note that it will be zero
if schooling investments are unaltered when n changes. Thus

$$
Q = A_{TM} - A_e + \varepsilon. \tag{11}
$$

If the weighted sum of average ages of consumption ($A_{TM}$) is larger
than the average age of efficiency ($A_e$), Q will be positive and higher
fertility would permit more of c, l, and h over the life cycle; when Q is
negative, lower fertility is beneficial in this sense. As an example,
suppose that the average age of general consumption is 45, while the
average age of efficiency is 50, so that Q = -5. A reduction in n by 1
percent per year, or .01, would permit an increase in life cycle expenditure
by 5 percent (.05 = -.01 × -5).
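
The worked example can be reproduced directly from (11). In this sketch the function names are hypothetical, `average_age` is a discrete stand-in for the integrals that define the average ages, and ε defaults to zero (schooling investments unchanged).

```python
import math


def average_age(n, ages, p, profile):
    """Average age of a flow (e.g. consumption or efficiency) in a stable
    population growing at rate n, with survival p(x) and age profile f(x)."""
    weights = [math.exp(-n * x) * px * fx for x, px, fx in zip(ages, p, profile)]
    return sum(x * wgt for x, wgt in zip(ages, weights)) / sum(weights)


def proportional_change(avg_age_consumption, avg_age_efficiency, dn, epsilon=0.0):
    """Equation (11) applied to a small change dn in the growth rate:
    proportional change in life cycle expenditure = Q * dn."""
    q = avg_age_consumption - avg_age_efficiency + epsilon
    return q * dn


# The example in the text: Q = 45 - 50 = -5, and n falls by .01.
gain = proportional_change(45, 50, dn=-0.01)
print(f"{gain:.2f}")  # prints 0.05
```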


VI. Utility Maximization 

The preceding discussion was solely concerned with the social budget 
constraint and the way it is affected by the population growth rate. To 
interpret these changes, it will be helpful to consider the behavior of 
an individual who maximizes life cycle utility from birth or who is in 
the hands of parents who make decisions for the individual consistent 
with such maximization. We can then derive an expression for the 
change in the individual’s utility arising from a change in the population’s
growth rate.

Consider an individual faced with the budget constraint of equation
(5), who seeks to maximize U(c, h, l). First-order conditions (assuming
U is differentiable) are readily shown to be

$$
\begin{aligned}
\frac{\partial U}{\partial l(x)} &= \lambda w e(x) e^{-nx} p(x), \\
\frac{\partial U}{\partial h(x)} &= \lambda w e^{-nx} p(x), \\
\frac{\partial U}{\partial c(x)} &= \lambda e^{-nx} p(x),
\end{aligned} \tag{12}
$$

where λ is the marginal value of resources. If optimization extends to
schooling decisions as well, then the following are added:

$$
\begin{aligned}
e^{-nx}p(x)e(x) &= \int e^{-nu}p(u)[1 - l(u) - s_t(u)]\,\frac{\partial e(u)}{\partial s_t(x)}\,du, \\
e^{-nx}p(x) &= w\int e^{-nu}p(u)[1 - l(u) - s_t(u)]\,\frac{\partial e(u)}{\partial s_m(x)}\,du.
\end{aligned} \tag{13}
$$

Now consider the effect of an exogenous change in n on life cycle
utility, U:

Substituting in the first-order conditions from (12), we get

Now, noting that λ is the marginal value of resources, dU/dR, we can
substitute from (15) into (8) and (11) to get

As noted above, the two integrals represent the costs and returns
from altering investment in education and therefore should be equal
when plans are optimal. This can be seen formally by substituting
equations (4) and (13) and rewriting the double integral. This establishes
the following fundamental result:

Later in the paper we will quantify the difference in average ages.
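
The chain of reasoning in this section can be sketched using only (1), (11), and (12). The displays below are a restatement under the assumption that schooling plans are optimal, so that ε = 0; they carry no equation numbers because they are a sketch, not the paper's own expressions, and Q is taken to be the part of the change in life cycle resources that involves c, l, and h, divided by R, as defined before (10).

```latex
\begin{align*}
\frac{dU}{dn}
  &= \int \left[ \frac{\partial U}{\partial c(x)}\frac{dc(x)}{dn}
     + \frac{\partial U}{\partial l(x)}\frac{dl(x)}{dn}
     + \frac{\partial U}{\partial h(x)}\frac{dh(x)}{dn} \right] dx \\
  &= \lambda \int e^{-nx}p(x)\left[ \frac{dc(x)}{dn}
     + w\,e(x)\frac{dl(x)}{dn} + w\,\frac{dh(x)}{dn} \right] dx
     && \text{by (12)} \\
  &= \lambda R\,Q = \lambda R\,(A_{TM} - A_e + \varepsilon)
     && \text{by (11)} \\
  &= \lambda R\,(A_{TM} - A_e)
     && \text{when } \varepsilon = 0 .
\end{align*}
```

Read this way, the sign of the utility effect of a change in n is the sign of the difference between the average age of resource use and the average age of efficiency.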

VII. The Household Setting

Life for an adult under a regime of higher fertility differs in two 
important ways. First, there are effects arising within the household 
since each will contain more children, and therefore there will be 
somewhat less consumption for each adult. Consumption will decline 
less than proportionately to the increase in household size, however, 
not just because on average children consume less than adults, but 
also because household scale economies, public goods, and hand-me- 
downs make the cost of an incremental child less than that of an 
average child. Second, there are between-household effects arising 
from the higher proportion of households with younger heads. Because
of this, each younger head will be supporting fewer households
with elderly retired heads, interest rates may be higher, publicly 
borne schooling and health costs will be different, and so on. These 
between-household effects arise from a changed age distribution of 
households, for a fixed average consumption and production pattern 
associated with each household of a given age. The within-household 
effects, by contrast, arise from a change in the consumption and 
production patterns for households of a given age, with the initial 
distribution of households by age of head held fixed. 

These considerations entail a revision of the analysis of the previous 
sections, which, while it is formally correct, suffers from two important
drawbacks. First, when the first-order conditions for utility maximization
are applied to the entire life cycle, it is implicitly assumed
that children maximize their utility from birth or that their parents 
act so as to do so for them. In fact, parents choose the consumption of 
goods and home services for their children, and, to a considerable 
extent, the educational investments and other choices for children are 
constrained by their parents’ budgets; parents cannot generally borrow
in their children’s names. Second, because of the lower incremental
costs of children in a household, the welfare interpretation of
changes in consumption expenditures when fertility changes is not 
clear; even if life cycle consumption in dollars were to decline, its 
utility value could increase. 

In order to distinguish the between- and within-household effects, 
the model must be somewhat altered. The extended model is developed
in Lee (1980, 1985), and the main results are as follows. (1) To
analyze the between-household effects, we use the mathematical apparatus
developed above but apply it to households by age of head
rather than to individuals. Household expenditures and time use do 
not need to be allocated to any particular household member for this 
purpose, so simple aggregate household consumption data, such as 
those frequently gathered in surveys, can be used. This is a major 
advantage of the approach. (2) To examine the within-household
effects, we need a measure of the net cost of an incremental child (in 
the neighborhood of the initial fertility). The appropriate measure of 
the net cost of the child (NCC) is the addition to household income 
that would be necessary to restore the initial level of consumption of 
the household head(s). This is a familiar measure in the literature on 
costs of children, and several empirical estimates are available. The 
NCC is to be multiplied by twice the average age of childbearing in 
order to convert changes in the number of children per household to 
changes in the population growth rate. 5 (3) The implications of fertil¬ 
ity change for individual utility can be derived only for household 
heads (which is to say adults) and therefore do not cover the period of 
dependency in the parental home, unless one is prepared to accept 
the strong assumption that parents act fully as agents for their chil¬ 
dren. (4) In the neighborhood of the fertility level chosen by adults, 
the net cost of an incremental child, as just defined, should exactly 
balance the utility that the parents derive directly from enjoyment of 
the incremental child, assuming a conventional, nonaltruistic utility 
function. Then, for small changes in aggregate fertility, induced by 
policy, this equality should continue to hold, and so the within- 
household costs of an induced fertility change should be ignored 
in studying its welfare implications. Only the between-household ef¬ 
fects, which are externalities to childbearing, are relevant by this ar¬ 
gument. In practice, there is interest in the magnitudes of both kinds 
of effects. 6 

In the household model, we assume that the household heads seek 
to maximize their utility over their adult years, say from age 20 on. 
Let N(x, m) be the expected number of surviving members aged m in 
households with head aged x. Let C(x) and H(x) be total household 
consumption of market goods and home time services, respectively, in 
households with head aged x. The instantaneous utility of the heads
from C(x) is given by $v[C(x), N(x, m)]$, where $v_C > 0$ and $v_N < 0$ (for all
m). Their instantaneous utility from H(x) is $q[H(x), N(x, m)]$, where
$q_H > 0$ and $q_N < 0$ (for all m). A commonly used special case of these
makes utility depend on consumption per head or per equivalent 
adult consumer; this case does not allow for returns to scale, however. 

5 The factor of twice the average age of childbearing is a linear approximation to the 
relationship between changes in the number of children per household and the popula¬ 
tion growth rate (see Lee 1980 ). 

6 In the household model, the sum of the within- and between-household transfers is 
identically equal to the transfer measured in the individual model, abstracting from the 
effects of economies of scale in the household. The between-household component will 
most likely be dominated by transfers to the aged and yield a benefit from higher 
fertility, while the within-household transfers consist of the costs and benefits of chil¬ 
dren and will be negative (in developed countries). 
The adult life cycle utility of the household heads is given by

\[ U = U\{v[C(x), N(x, m)],\; q[H(x), N(x, m)],\; l(x, m),\; f\}, \qquad (18) \]

where $l(x, m)$ is the leisure of a member aged m when the head is aged
x, and $f$ is life cycle fertility. Note that this is a nonaltruistic utility
function, similar in form to that used by Eckstein and Wolpin (1985). 
The household life cycle budget constraint is analogous to (5). 

Ignore for the moment the appearance of life cycle fertility in the 
utility function. In this case, after considerable effort, it is possible to 
derive the following simple and plausible result (see Lee 1985, p. 18): 


\[ \frac{dU/dn}{\lambda R} = A_{TM} - A_E - 2A_f \cdot \frac{NCC}{R}, \qquad (19) \]


where $A_{TM}$ and $A_E$ refer to average ages for heads of households at
household consumption and production, construed to include both
market and nonmarket items and time as well as goods; their difference
represents the change in between-household intergenerational
transfers. The term $A_f$ refers to the average age of childbearing
(which is about 26 years in the United States). The formal derivation
of this result is in Lee (1985), and we do not repeat it here, but the
interpretation is clear. When the population growth rate changes, the
distribution of households by age of head changes, which alters 
the social and individual life cycle budget constraints (as in the indi¬ 
vidual framework) through both the financial sector (interest rates) 
and the public sector (age specificity of transfers). But in addition, a 
change in the growth rate alters the number of children per household
(by $df/dn = 2 \cdot A_f$). This reduces the consumption by adults in
the household by NCC, the present value of the expected cost of the 
incremental (third) child, measured net of the child’s contributions to 
household services and earnings. This latter term represents the 
change in within-household transfers due to the change in fertility 
corresponding to the change in the population growth rate. It is 
shown in Lee (1985) that the cost of children at each age must be 
measured as the amount of additional household income and time 
necessary to leave the household heads as well off as in the absence of 
the incremental child. In the analysis, this is given by the integral of
discounted terms of the form $[\partial v/\partial N(x, x - k)]/[\partial v/\partial C(x)]$ and
$[\partial q/\partial N(x, x - k)]/[\partial q/\partial H(x)]$. This is greater than the amount by which
the heads’ own consumption declines since an incremental dollar or 
hour (plus or minus) will not be wholly allocated to the heads but 
rather shared among all household members (see Lazear and Michael 
1983). 

Now consider a couple seeking to maximize U over fertility, consumption,
and leisure, subject to a household budget constraint similar
to the individual budget constraint of equation (5). The individual
household must take the population growth rate and age distribution
as given, independent of its own fertility decision. It will therefore
choose a number of children such that the marginal utility from
the last equals the net cost of an incremental child ($U_f = NCC$), which
is equivalent to the within-household transfer (as shown in Lee
[1985]).

From equation (18), the full derivative with respect to n will include
the right-hand side of (19), times $\lambda \cdot R$, and also a term equal to
$U_f \cdot (df/dn)$, where $df/dn = 2 \cdot A_f$. Substituting NCC for $U_f$ and
simplifying yields

\[ \frac{dU}{dn} = \lambda \cdot R \cdot (A_{TM} - A_E) \]

in the neighborhood of equilibrium. The individually optimal fertility
rate is not socially optimal since couples do not take into account the
effect of their fertility on the aggregate growth rate and age distribution;
$(A_{TM} - A_E)R$ is an externality to childbearing. This is an explicit
expression for the externality noted by Eckstein and Wolpin (1985) 7
in the context of a model that included capital. Had we assumed in
equation (18) an altruistic form for the utility function, the conclusion
would not follow (see Pazner and Razin 1980). In Eckstein and Wolpin,
this externality occurs only when the economy is golden-rule
Samuelson, one of the two possible stable outcomes.

VIII. Empirical Estimation: 

The Between-Household Transfers 

Placing the analysis in a household setting evidently affects the empir¬ 
ical calculations that are necessary. The average ages of consumption 
and efficiency must be calculated on a household basis rather than on 
an individual basis, and this requires that individuals in the stable 
population be allocated to households. In what follows, we look first at 
the between-household effects and then make a rough attempt to 
estimate the magnitude of the within-household intergenerational 
transfers. 

Here we attempt to quantify how a change in population growth in
the contemporary United States would affect household wealth and
generalized consumption. To do so, we estimate the household profiles
by age of head for leisure, consumption of market goods, consumption
of home-produced goods, market resources for schooling,
school time, and efficiency. The empirical estimates should be viewed
as preliminary since often data were not available in the required
form and interpolation was necessary, and at other times data came
from surveys in which sample sizes were rather small. Most of the
time-use data come from the 1975 time budget survey by the Institute
for Social Research at the University of Michigan. The data on consumption
of goods come from the 1972-73 Consumer Expenditure
Survey (CES) (U.S. Bureau of Labor Statistics 1978) and various other
government sources. Appendix A gives a complete description of the
data and their sources.

7 Eckstein and Wolpin (1985, pp. 101, 102) interpret the externality as arising from
the effect of fertility on the market rate of interest. But the effect of an increase in the
market rate of interest (extended by analogy to include governmental transfers) is
proportional to $A_{TM} - A_E$. When average ages of consumption and production are
equal, the rate of interest (implicit or explicit) makes no difference over the life cycle.

Figure 1 displays the average individual age profiles for most of the 
components of consumption. The leisure and school time schedules 
are simply the average time spent by members of the age group in 
these activities. The consumption of home-produced services, how¬ 
ever, is measured in efficiency units; it is the number of hours it 
would take an individual with an efficiency level of 1.0 to produce the 
amount of home time services consumed by the average individual at 
age x. We have focused on three sources of goods consumption: 
goods bought by the household, public outlays on education, and 
public outlays on health (although we refer to these as “goods,” they
of course include many services). Goods bought by the household, 
which we refer to as “private goods,” are available only on a house¬ 
hold basis. Both education and health costs are available on an indi¬ 
vidual level and are shown in figure 1. 

For present purposes, we must aggregate individuals and their as¬ 
sociated age profiles into households. We do this here by assuming 
some stylized rules, as follows: individuals leave their parents’ house¬ 
hold at age 20, at which time they marry and form their own house¬ 
hold; they have their first birth at age 25 and their second birth 3 
years later; and they die in proportions consistent with the contempo¬ 
rary U.S. life table. Females and males are assumed to have identical 
life cycles and to experience the same mortality risks. 8 
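
The stylized rules above can be sketched in a few lines of Python. This is only an illustration, assuming a two-child household and ignoring mortality; the function name `household_members` and the constants are hypothetical and are not taken from the paper.

```python
# Stylized household composition by age of head (mortality ignored).
LEAVE_HOME_AGE = 20             # children leave at 20 and form their own household
BIRTH_AGES_OF_HEAD = (25, 28)   # first birth at 25, second birth 3 years later


def household_members(head_age):
    """Return the ages of members in a household whose head is aged head_age.

    Illustrative assumption: two adults of equal age and two children,
    all of whom survive.
    """
    if head_age < LEAVE_HOME_AGE:
        raise ValueError("household heads are assumed to be at least 20")
    adults = [head_age, head_age]
    children = [
        head_age - birth_age
        for birth_age in BIRTH_AGES_OF_HEAD
        if 0 <= head_age - birth_age < LEAVE_HOME_AGE
    ]
    return adults + children


print(household_members(40))  # two adults aged 40 and children aged 15 and 12
```

Weighting each member by a survival probability from a life table would turn these counts into the expected numbers N(x, m) used in the household model.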

Recall from equation (4) that the individual time budgets were con¬ 
verted to common units of market goods to express the budget con¬ 
straint. This means that the leisure hours at each age must be multi¬ 
plied by the efficiency level at that age, which is also the case with 
school time; thus they are measured by their opportunity cost in 

8 We also calculated our empirical estimates under the alternative life cycle trajectory 
that individuals formed households at age 25 and bore children at ages 30 and 33. 
Under these rather extreme assumptions, the average ages of consumption and pro¬ 
duction both increased but by similar amounts. The difference between them changed 
only slightly, from 1.0 year to 1.1 year, suggesting that our results are relatively robust 
to assumptions about the life cycle trajectory. 
Fig. 1.—Average individual age profiles of consumption. These age profiles do not
take mortality into account; i.e., consumption at age x is the average value for those 
alive at age x. See App. A for sources. 

terms of market goods. With these adjustments to the household 
profiles, figure 2 shows the household age schedules of the values of 
leisure, consumption of home production, school time, public goods, 
and market goods. Market consumption is originally observed on a 
household basis, so no aggregation is necessary. 

The household's life cycle begins with two adults at age 20. Both
adults are subject to mortality risks. Thus the average number of
adults in households declines with age. The effect of declining household
size is particularly apparent in the household schedule of public
expenditures on health. At the older ages, these costs rise on an
individual basis, but the household expenditures remain fairly constant
because of a declining number of adults per household.

Fig. 2.—Average household profiles of consumption, by age of head (see notes to
fig. 1).

Figure 3 shows the individual and household profiles of efficiency.
For those aged 14-64, age-specific efficiency levels are based on male
age-specific wage rates, which were obtained from Denison (1979). 9
For those under age 14 and over age 65, efficiency levels were assigned
on the basis of our subjective judgment. 10

Fig. 3.—Age profiles of efficiency for individuals (a) and households (b) (see notes to
fig. 1).

By assumption, nonlabor income has not been included in the anal¬ 
ysis, so total full wealth is just the summation over all ages of the wage 
multiplied by the efficiency level. Household efficiency peaks mark¬ 
edly around age 40: both adults are at their highest efficiency levels, 
and the household also contains two teenage children who are as¬ 
sumed to be productive, albeit at low levels. 

From the age profiles we can calculate the average ages (of house¬ 
hold heads) for the various forms of consumption and, combining 
these with their shares in total consumption, derive the overall aver¬ 
age age of consumption. Table 1 displays these average ages along 
with their respective shares. As expected, the average age of school¬ 
ing (37) is much younger than that for the other forms of consump¬ 
tion, while the consumption of publicly funded health services occurs 
much later than other forms of consumption (62). (Recall that “the
average age of schooling” refers to the age of the heads of the households
whose members are engaged in schooling.) As it turns out, the
average age of market consumption is not very different from the 
average age of time consumption. 
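
As a rough check on how the component ages combine, the overall average age can be sketched as a share-weighted mean of the component average ages reported in table 1. The dictionary layout and the function name `average_age` are illustrative assumptions; the figures are the table's rounded values, so the result only approximates the published totals.

```python
# (average age of head, share of total consumption in percent), from table 1
components = {
    "public health": (62.2, 1.3),
    "public education": (37.1, 1.0),
    "household expenditures": (48.8, 17.2),
    "home production": (51.6, 8.5),
    "education time": (37.3, 1.1),
    "leisure": (50.1, 71.0),
}


def average_age(items):
    """Share-weighted mean of component average ages."""
    items = list(items)  # allow generators to be traversed twice
    total_share = sum(share for _, share in items)
    return sum(age * share for age, share in items) / total_share


overall = average_age(components.values())
without_leisure = average_age(
    value for name, value in components.items() if name != "leisure"
)
print(round(overall, 1), round(without_leisure, 1))
```

Dropping leisure and renormalizing the remaining shares gives the corresponding average age for consumption excluding leisure.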

Consumption of market goods makes up only 17.2 percent of total 
wealth, with consumption of home services contributing another 9 
percent. Expenditures on schooling, for time and market goods to¬ 
gether, account for only 2.1 percent of total wealth. Leisure, with 71 
percent of total consumption, is by far the dominant form of con¬ 
sumption. This partly reflects the fact that sleep and other personal 
care are included in the leisure category. 11 

9 We also calculated intergenerational transfers assuming different age profiles of 
efficiency by sex and for home production. As an extreme case, women of all ages were 
assigned the same efficiency level, set at approximately 60 percent of the average male 
efficiency rate. Furthermore, all home production was assumed to be performed by 
women. (According to time-use surveys, women perform about 80 percent of home 
production.) With these extreme changes, the efficiency profile became much flatter 
than in the base case, but the average age of efficiency changed only slightly, from 48.9
to 49.2. Moreover, the average age of consumption changed as well, increasing from 
49.9 to 50.2. Thus the difference in the average ages, which indicates the degree of 
intergenerational transfer, remained virtually unchanged. The difference in average 
ages when leisure time is excluded also showed no significant change under these 
alternative assumptions for the age schedule of efficiency. 

10 The profiles assume that efficiency declines linearly from .75 at age 65 to .5 at the 
oldest age. We also calculated the average ages of consumption and production assum¬ 
ing that the efficiency level of the elderly declined to zero at the oldest age. This 
reduced the average age of generalized consumption and generalized production, both 
by about 1 year. The difference between the two average ages, then, is relatively 
insensitive to assumptions concerning elderly efficiency levels. 

11 Including sleep and other personal care in the leisure category might appear 
problematic since such time could be viewed as minimal biological maintenance and 
TABLE 1

Average Ages and Shares of Components of Consumption

                                                     Shares of Total Consumption (%)
                                      Average Ages   Including Leisure   Excluding Leisure

Consumption of Goods
  Public expenditures on health           62.2              1.3                 4.4
  Public expenditures on education        37.1              1.0                 3.5
  Household expenditures                  48.8             17.2                59.1
  Subtotal                                49.1             19.4                67.0

Consumption of Time
  Home production                         51.6              8.5                29.3
  Education                               37.3              1.1                 3.7
  Leisure                                 50.1             71.0                  .0
  Subtotal                                 ...             80.6                33.0

Average Ages for Goods and Time Combined
                                                     Including Leisure   Excluding Leisure
  Total consumption                                        49.9                49.4
  Total time                                               50.0                50.0
  Total goods                                              49.1                49.1
  Total production                                         48.9                46.0


When the various forms of consumption are combined, the average 
age of “generalized consumption” equals 49.9. The average age of 
“generalized production” is 48.9. Thus the average age of consump¬ 
tion exceeds the average age of full production by 1.0 year. Consump¬ 
tion occurs at slightly older ages than production, indicating that the 
net direction of intergenerational transfers between households is 


excluded from the analysis altogether. But it turns out that the treatment of sleep 
makes essentially no difference to our results because of its offsetting effects. If per¬ 
sonal care were excluded from the leisure category, leisure’s share in total consumption 
would fall from 71 percent to 48 percent, and this would widen the difference between 
the average ages of generalized consumption and production. However, now the value 
of total life cycle wealth would fall by a corresponding amount so that the intergenera¬ 
tional transfer, which is the product of the two, would remain the same. The only effect 
would be to reduce the proportion of life cycle wealth transferred, not the actual dollar 
amount transferred. Thus, however we valued sleep, either at the actual wage rate or at 
a wage rate of zero, the basic results of the model would be unchanged. We retain 
personal care in the analysis, however, because it is clear that the presence of young
children leads to a reduction in sleep and therefore that intergenerational transfers are 
to some extent made from this category. 
upward, from younger ages to older ones. Consequently, if intergenerational
transfers between households were the only effect operating,
a stable population with higher fertility would have somewhat
higher life cycle consumption; with a growth rate of .01 per year, life
cycle consumption per household would be 1.0 percent higher than
with a growth rate of zero.
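
The arithmetic behind this statement can be written out as a short sketch: the proportional change in life cycle consumption is approximated by the difference in average ages times the change in the growth rate. The function name is hypothetical, and the linear approximation is assumed to hold only for small changes in the growth rate.

```python
def between_household_effect(avg_age_consumption, avg_age_production, growth_change):
    """Approximate proportional change in life cycle consumption."""
    return (avg_age_consumption - avg_age_production) * growth_change


# Average ages of generalized consumption and production, including leisure.
effect = between_household_effect(49.9, 48.9, 0.01)
print(f"{effect:.1%}")  # about 1.0 percent
```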

Figure 4a overlays the “expected” age profiles of consumption and
production. The age profiles are quite similar, though consumption 
occurs at discernibly older ages than production. These profiles are 
different from those presented in figures 1 and 2 in that mortality 
risks are explicitly taken into account. The average value of consump¬ 
tion (and production) at any age is a weighted sum of households that 
survive and those that do not. Thus figure 4 should be interpreted as 
the efficiency and consumption profiles that would be “expected” by 
heads aged 20 viewing the future life cycle of their household. 

The age profiles of consumption and production (and hence their
average ages) are so similar because leisure, which cannot be transferred,
represents 71 percent of total household wealth. Alternatively,
we can express the intergenerational transfers as a proportion
of all wealth except leisure (i.e., market goods, home services, and
education); it then rises from 1.0 to 3.3 (= 1/[1 - .7]). The profiles of
consumption and production excluding leisure are shown in figure
4b, which shows more clearly that consumption occurs at older ages
than production, reflecting the 3.3-year difference in their average
ages.
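
The rescaling to a nonleisure base is a one-line calculation, sketched here with a hypothetical helper. It assumes, as the text does, a rounded leisure share of .7; using the unrounded 71 percent would give a slightly larger figure.

```python
def rescale_excluding_leisure(effect_pct, leisure_share):
    """Express a transfer effect as a percentage of nonleisure wealth."""
    return effect_pct / (1.0 - leisure_share)


rounded_share = rescale_excluding_leisure(1.0, 0.70)
exact_share = rescale_excluding_leisure(1.0, 0.71)
print(round(rounded_share, 1), round(exact_share, 1))
```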


IX. Empirical Estimation: The Within-Household 
Transfers 

Note that the calculations above did not include the incremental time 
and goods cost of children borne by the household, to which we now 
turn. A more detailed discussion of these within-household transfers 
can be found in Lee (1985). 

The treatment of children’s consumption poses special measurement
problems. When the growth rate changes because of fertility
variation, the added or deleted children are incremental members of
“existing” households, and it is therefore their marginal effect on
time and commodity expenditures that interests us, as it would be for
any age group that did not typically head the households in which it
lived. Marginal effects will generally be smaller than average ones
because of increasing returns to scale and the existence of public
goods in the household.

From an empirical point of view, it is fortunate that the estimation
of marginal effects for children is theoretically appropriate. A number
of studies report results that can be used to form at least rough
estimates of the age profile for a marginal child’s consumption, although
for time demands we are on less firm ground. The effects of a
marginal child depend on the initial number of children, and here we
base our calculations on an initial two-child family, corresponding
closely to an aggregate growth rate of zero.

Fig. 4.—Expected age profiles of consumption and production. a, Includes leisure
time. b, Excludes leisure time. These graphs explicitly take account of mortality. The
values at age x are a weighted average of households that survive and those that do not.

The net transfer of market goods to a third child is calculated in
Lee (1985) on the basis of the studies by Mason (1975) and Lazear and
Michael (1983). Lee found a net transfer of $17,000 (in 1972 dollars),
which includes earnings of coresident children amounting to $5,000.
(Any net bequests and other transfers after the child has left home
should also be included here; lacking data, we have ignored these.
This omission affects our estimates of the private cost of a third child
but not our estimates of between-household transfers.)

The estimates of time costs of the third child are based on Turchi
(1975) and Hill and Stafford (1977). Our calculations of the time costs
are provided in Appendix B. These estimates show a time cost of
[...]80 hours over the couple’s lifetime, which evaluated at the 1972
wage rate equals $13,230.

The total net cost to the household of a third child living its entire
life in 1972 in an average household would be the sum of the home
time and market goods cost, or about $30,000 (= 17,000 + 13,320)
in 1972 dollars. This is undiscounted, appropriate to a population
growth rate of zero.

We can now calculate the within-household transfer effect, which,
following equation (19), is two times the average age of fertility ($A_f$)
times the net cost of the incremental child, divided by total wealth.
This turns out to equal .50 (= 2 x 25 x [30,000/3,000,000]). Expressed
as the proportional cost of a change in the growth rate from 0
to .01 per year, this equals .0050 (= .50 x .01), or 0.50 percent of
lifetime wealth.
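
The within-household calculation can be reproduced with the rounded figures quoted in the text. The variable names below are illustrative; the values (an average age of fertility of 25, a net cost of about $30,000, and total wealth of $3,000,000) are those used in the parenthetical arithmetic above.

```python
AVG_AGE_FERTILITY = 25        # A_f as used in the text's arithmetic
NET_COST_OF_CHILD = 30_000    # NCC, 1972 dollars (goods plus time, rounded)
TOTAL_WEALTH = 3_000_000      # life cycle wealth including leisure, 1972 dollars

# Within-household term of equation (19): 2 * A_f * NCC / R
within_term = 2 * AVG_AGE_FERTILITY * NET_COST_OF_CHILD / TOTAL_WEALTH

# Proportional cost of moving the growth rate from 0 to .01 per year
proportional_cost = within_term * 0.01

print(within_term, f"{proportional_cost:.2%}")
```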

If fertility were raised by decree sufficiently to move a population
from stationarity to annual growth of 1 percent, then there would be
a life cycle consumption gain of 1.0 percent arising from between-household
transfers and a loss of 0.5 percent through within-household
transfers; the net change would be a gain of about 0.5
percent of total life cycle wealth (roughly $15,000), a little more than
[...] percent of life cycle wealth excluding leisure. To a first approximation,
the within-household costs of supporting the incremental children
would be offset by the satisfaction parents received from them,
while there would be no such offsetting decline in utility to cancel the
gains from increased between-household transfers arising from a
younger age structure. These results are, however, modified when
capital is taken into account, as we now discuss.
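
Putting the two transfer effects together, the net figure follows from the right-hand side of equation (19). The sketch below uses the rounded values reported in the text; the function name is hypothetical, and the dollar amount simply applies the net share to total life cycle wealth.

```python
def transfer_effect(a_tm, a_e, a_f, ncc, wealth):
    """Between-household term minus within-household term, as in equation (19)."""
    between = a_tm - a_e
    within = 2.0 * a_f * ncc / wealth
    return between - within


WEALTH = 3_000_000   # total life cycle wealth, 1972 dollars
GROWTH_CHANGE = 0.01  # from stationarity to 1 percent annual growth

net = transfer_effect(49.9, 48.9, 25, 30_000, WEALTH)
net_share = net * GROWTH_CHANGE
net_dollars = net_share * WEALTH

print(f"{net_share:.2%} of life cycle wealth, about ${net_dollars:,.0f}")
```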
X. Wider Generality of Results

These results were derived within the framework of a theoretical
model that assumed that no durable goods exist, that production and
consumption age profiles are stationary over time, and that population
age distributions are stable, while population grows at a constant
rate. These assumptions may appear to be so rarefied as to deprive
the results of all practical interest. In fact, however, the results and
measures hold more generally. They continue to hold, for example, if
age profiles of consumption and earnings have a fixed shape but rise 
secularly at a constant rate. The discount rate then would equal that 
constant growth rate plus the population growth rate, and a little 
algebra reveals that the equations of the system then reduce to the 
form analyzed here. 

Furthermore, if we assume that the saving rate is permanently at 
the level that maximizes consumption, then the analysis of transfer 
effects is essentially unchanged in a neoclassical growth model with 
capital accumulation. For the optimal saving rate (in the golden-rule 
sense) equals the share of capital in output, while consumption equals 
labor income. Thus the situation is analogous to the consumption- 
loan economy, in which consumption is constrained to equal income 
from labor alone. The important difference when capital is intro¬ 
duced is that more rapid population growth now leads to lower op¬ 
timal capital per head and therefore to lower labor earnings and 
consumption. Through this capital dilution effect, a 1 percent in¬ 
crease in the population growth rate would reduce per capita con¬ 
sumption by about 2.3 percent. 13 This effect is additive with the trans¬ 
fer effect and of opposite sign, and it appears to be somewhat larger. 
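
The size of the capital dilution effect rests on the ratio of capital to consumption, which the accompanying footnote builds up from a few ratios. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the inputs are the rounded figures quoted in the footnote.

```python
# Ratios quoted for the United States in 1972.
capital_to_market_goods = 3.5  # capital / market goods consumption
home_to_market_goods = 0.5     # home-produced goods are about half of market goods

# Time and market goods consumption together are 3/2 of market consumption.
capital_to_consumption = capital_to_market_goods / (1.0 + home_to_market_goods)

# Including leisure: market goods are 17.2 percent and the broader
# consumption base 97.9 percent of the total.
capital_to_full_consumption = capital_to_market_goods / (97.9 / 17.2)

# Approximate percentage fall in consumption per 1-point rise in the growth rate.
dilution_pct = capital_to_consumption * 1.0

print(round(capital_to_consumption, 1), round(capital_to_full_consumption, 1))
```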

While Arthur and McNicoll (1978) and Lee (1980) simply assumed 
the golden-rule condition, Eckstein and Wolpin (1985) show that a 
similar system, with direct (nonaltruistic) utility from children, will 
converge either to the Samuelson case, with golden-rule investment, 
or to the classical case, with the interest rate higher than the popula¬ 
tion growth rate. 14 In the golden-rule Samuelson outcome, there are 


12 Willis (1988) considers this capital dilution effect to be a component of the intergenerational
transfers; we do not since it is unrelated to the population age distribution.

13 The derivative of consumption with respect to the population growth rate across 
optimal golden-rule paths is the ratio of steady-state capital to consumption (see Arthur 
and McNicoll 1978; Lee 1980). From Jorgenson (1980) the ratio of capital to market 
goods consumption is about 3.5 for the United States in 1972. Home-produced goods 
are approximately one-half the value of market goods, and thus consumption of time 
and market goods equals 3/2 of market consumption, putting the capital consumption 
ratio at about 2.3. Similarly, the ratio of capital to full consumption, including leisure, 
equals .6 (3.5/[97.9%/17.2%]).

14 Their model has exogenous cost per child and no public sector. "Samuelson" and 
"classical" refer here to the credit balance in financial markets only. 
positive externalities to childbearing, and life cycle utility is below its
maximum attainable level. This is similar to the case we analyze in this 
paper. However, if the classical non-golden-rule outcome occurs, 
there are no externalities to childbearing, and the steady state is op¬ 
timal. With public-sector transfers, externalities and suboptimality 
would presumably occur in this case as well (Willis 1987). 

Unfortunately, it is far more difficult to relax the steady-state as¬ 
sumption that all variables are constant or changing at constant rates. 
Of course one could run simulations, but it would be hard to justify 
the assumption of a fixed shape of the age profiles when interest rates 
were varying. 15


XI. Social Welfare Implications 

Under the assumption of a competitive stationary economy, each 
newborn will in present value consume exactly what it produces over 
the life cycle, will receive in government transfers exactly what it 
contributes, and will transfer to its kin exactly what it itself receives
from them. But what is true over the life cycle is not necessarily true 
in the cross section. Thus the population as a whole at any instant may 
be a net debtor or creditor in relation to the market, the government, 
or the as-yet unborn. Because of this potential aggregate indebted¬ 
ness, the population may benefit from higher or lower fertility. In this 
paper, we have estimated the aggregate credit position of marginal 
members of the population (incremental newborns) on a per capita 
basis and found it to be a net debt of about $15,000 (1972 dollars) 
evaluated at the golden-rule interest rate of 0 percent. 

Ignoring the utility derived directly from descendants, including 
own children, we find that an increase of 1 percent per year in the 
population growth rate, arising from an additional 0.5 children per 
woman, would raise life cycle consumption (excluding leisure) by 
about 3.3 percent through between-household intergenerational 
transfers. This is offset by a negative 1.7 percent (representing the 
cost of a third child) because of within-household transfers. The ef¬ 
fect of intergenerational transfers alone, therefore, would be to raise 

15 The age profiles of consumption and production used in the analysis are based 
primarily on data from the early 1970s. While the actual population was not stable then,
it more closely resembled a stable population with a growth rate of 1 percent than one 
with zero population growth. Thus it might be more appropriate to analyze the effect 
of a small deviation from a growth rate of .01 than of zero. When we redo the analysis 
in this way, ($A_{TM} - A_E$) declines from 1.0 to .7, with little change in the NCC. Also, we
have drawn on the analogy between the social budget constraint and the individual life 
cycle budget constraint. However, this analogy holds only if the real rate of interest 
equals the population growth rate plus the expected rate of labor-augmenting techno¬ 
logical progress, as it should in steady-state growth. This condition was probably not 
met in 1972-73, which introduces some further distortion into our analysis, but we 
doubt that this makes much difference to our results. 
life cycle consumption by about 1.7 percent, or about $15,000. However,
these intergenerational transfers are offset by the capital dilution
effect of 2.3 percent. Therefore a .01 increase in the population
growth rate would reduce life cycle consumption by about 0.5 per¬ 
cent, or about $4,000 (1972 dollars). Transfer effects are not suf¬ 
ficient to offset capital dilution effects, in the neighborhood of the 
U.S. allocation of the early 1970s. 

What are the welfare implications of these observations? We cannot 
say without modeling the trade-off between our consumption and the 
satisfaction from children. There are two main possibilities here. One 
is represented in equation (18): utility is derived directly from the 
number of children. This is very close to the utility function in Eck¬ 
stein and Wolpin (1985), and they have shown that indeed exter¬ 
nalities to childbearing do arise in such a model and that the system 
will converge to the golden rule as one of two possible outcomes. 
While one cannot say that the outcome is Pareto inefficient, one can 
say that the competitive steady-state equilibrium will yield lower fertil¬ 
ity, lower interest rates, higher capital per worker, and lower life cycle 
utility than will a social planner’s choice of steady state because of the 
positive externalities to childbearing. 

In this model, the marginal utility of a third child will exactly offset 
the within-household transfer (NCC), and therefore only the be- 
tween-household (market plus government) transfer is an externality. 
This amounts to 3.3 percent of nonleisure life cycle wealth, or about 
$29,000. As for capital dilution, this is not an externality (or rather it 
is a pecuniary externality, and so without welfare implications). In this 
analysis, therefore, there would be nontrivial welfare gains to be real¬ 
ized from having higher fertility, although without a dynamic analysis 
we cannot conclude that it would be desirable to raise fertility in order 
to reach that preferable steady state. 

Suppose, on the other hand, that parents derive utility from the 
utility of their children, as described by the altruistic utility function. 
It has been shown that in this case, in the absence of governmental 
transfers, the system converges to the steady state, which maximizes 
life cycle utility, with an interest rate above the golden-rule level, and 
a zero aggregate credit balance when evaluated at this interest rate 
(Pazner and Razin 1980; Willis 1988). When there are governmental 
transfers that affect the costs of children, externalities may arise (Wil¬ 
lis 1987). But we are not in a position to evaluate these externalities 
under the assumption of an altruistic utility function. 

XII. Summary and Conclusion 

Population age distributions in the developed world will undergo 
secular aging if fertility continues at its current low levels. This article 
considers whether the pattern of intergenerational flows of time and
goods is such that an exogenously imposed increase in fertility, making
the steady-state population slightly younger, would raise or lower
the life cycle well-being of individuals, at least during their adult 
years. There have been very few previous attempts to address this 
question systematically, considering both public and private transfers 
and childhood as well as old age. This study differs from others 
particularly in its effort to include the transfers of services produced 
with inputs of home time, such as the care that parents provide for 
their children. This study is also unusual in that it distinguishes ex¬ 
plicitly between transfers taking place within households and those 
taking place between households. With a nonaltruistic parental utility 
function, the between-household transfers are externalities to child¬ 
bearing, while the within-household transfers should, at the margin, 
be equated to the satisfaction the parents receive from an incremental 
child. 

Comparing populations with total fertility rates of two and three, 
we find that, with higher fertility, life cycle consumption would be 
raised by $29,000 through transfers between households, reduced 
$14,000 by transfers within households, and reduced a further 
$20,000 by capital dilution, for a net reduction of $4,000 in house¬ 
hold life cycle consumption, all in 1972 dollars. All these numbers are 
point estimates with a considerable range of uncertainty. 16 

How are these numbers to be interpreted? They can be used to 
address different questions, as summarized in the following table (k is 
the marginal utility of wealth): 


Effect of Raising Total Fertility Rate from Two to Three

                                      Consumption-Loan    With Capital

Household life cycle consumption      +$15,000            -$4,000
Household life cycle utility:
  Traditional utility function        +$29,000k           +$10,000k
  Altruistic utility function                        ?



16 Lee (1985) calculates the between- and within-household transfers when only mar¬ 
ket goods and services are considered. He finds a between-household transfer of 4.4 
percent of total wealth and a within-household effect of 0.9 percent, for a combined 
transfer of 3.5 percent. Our results are different primarily because of the inclusion of 
leisure in life cycle consumption, as the recalculation in the text shows. Nonetheless, 
some differences remain even after the recalculation because of home time services. 
The between-household effects are smaller in this paper because in practice services 
produced with nonmarket time are rarely transferred between households. The within- 
household transfers are larger here because the time costs of an incremental child 
consume a far greater share of all home production time than the commodity costs of a 
child as a share of all market consumption. This is as we expect: children are believed to 
be time intensive. 
Implications of higher fertility for the real-world case, with capital,
are shown in the second column. First consider the effect of higher
fertility on life cycle consumption. This is the case considered by
Samuelson (1975, 1976) and by Arthur and McNicoll (1978). Here we
find that the costs of children (within-household transfers) and the 
capital dilution effect outweigh the substantial gains that do indeed 
accrue from the reduced costs of supporting the elderly in a younger 
population; on net, higher fertility would reduce life cycle consump¬ 
tion. Once we try to interpret this result in terms of life cycle utility, 
the outcome depends on the assumed form of the trade-off between 
utility from consumption and directly from children. With a conven¬ 
tional utility function, the costs of children are offset by direct satis¬ 
faction for small fertility variations, and so there appears to be a utility 
gain from higher fertility. With an altruistic utility function, however, 
this need not be the case, and no conclusions can be drawn. 


Appendix A 

Description of Data Inputs 

Our analysis requires data on consumption of market goods and services and 
consumption of time by households, by age of head. For most forms of 
consumption, age data were available on an individual basis. By assuming a 
given household structure at each age of head, we combined the individual 
age schedules to form the household age schedules of consumption. Simi¬ 
larly, for production, we assume an individual age schedule of efficiency, 
which is then used to aggregate into a household schedule. 

We assumed the following household structure. The household starts with 
two heads aged 20, who are subject to mortality (hence number of adults 
declines with age of household). Households have two children, one born 
when heads are aged 25, the other born when heads are aged 28. 

Tables A1 and A2 show the individual and household age schedules of 
consumption and production. Sources for these data are given below. 


Goods Consumption 

Personal household consumption by age of head was taken from the 1972-73 
Consumer Expenditure Survey (U.S. Bureau of Labor Statistics 1978, p. 507). 
This should include all market goods purchased by the household. Data were 
available in 10-year age groups, generally. Linear interpolation was used to 
calculate consumption at single years of age. Special runs of the CES public- 
use tape, thanks to Robert Michael, provided consumption for those aged 65- 
85 in 5-year age groups. Personal household consumption for those aged 85 
and over is assumed constant. Two additions were made to the CES data. 
First, we expect that the elderly in nursing homes would not be included in 
the CES, so we added per capita private expenditures on nursing home costs 
among the elderly (Fisher 1980, p. 81). Second, private expenditures on 
college are thought to be underreported in the CES. We added one year of
per capita private expenditures on college for each child aged 19 in the
household (1978 Statistical Abstract of the United States, pp. 136, 138).

TABLE A1

Individual Age Schedules, in 5-Year Groupings

      Consumption of Market         Consumption of Time                 Production
      Goods and Services            (Hours per Day)                     (in Efficiency
      (1972 Dollars per Day)                                            Units)

      Publicly       Publicly
      Financed       Financed                            Consumption
      Health         Education      School    Leisure    of Home
Age   Expenditures   Expenditures   Time      Time       Production

0     $0.15          $0.00          .00       24.0       1.8            .00
5     $0.15          $2.44          3.69      20.3       .6             .00
10    $0.15          $2.44          4.41      19.4       .7             .17
15    $0.15          $2.02          3.44      18.2       .7             .32
20    $0.38          $0.54          .74       16.9       .7             .56
25    $0.38          $0.20          .28       15.6       1.0            .78
30    $0.38          $0.10          .14       15.6       1.2            .90
35    $0.38          $0.00          .00       15.6       1.3            .97
40    $0.38          $0.00          .00       15.6       1.3            .99
45    $0.38          $0.00          .00       16.8       1.3            .98
50    $0.38          $0.00          .00       16.8       1.2            .96
55    $0.38          $0.00          .00       16.8       1.2            .92
60    $0.38          $0.00          .00       16.8       1.1            .85
65    $1.87          $0.00          .00       20.0       2.7            .75
70    $2.12          $0.00          .00       20.0       2.5            .70
75    $2.37          $0.00          .00       20.0       2.4            .67
80    $2.62          $0.00          .00       20.0       2.2            .63
85    $2.87          $0.00          .00       20.0       2.1            .59
90    $3.12          $0.00          .00       20.0       2.0            .55
95    $3.37          $0.00          .00       20.0       1.8            .52
100   $3.52          $0.00          .00       20.0       1.8            .49
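
The interpolation step described under Goods Consumption (CES values reported for 10-year age groups, converted to single years of age) could be sketched roughly as follows. This is an illustration only: the group midpoints, the dollar amounts, and the function name `interpolate_to_single_years` are hypothetical, and the source does not state at which age within each group the published value was anchored.

```python
import numpy as np

def interpolate_to_single_years(group_ages, group_values, ages):
    """Linearly interpolate grouped consumption values to single years of age.

    group_ages   -- ages at which the grouped values are anchored (assumed midpoints)
    group_values -- consumption reported for each age group
    ages         -- single years of age at which values are wanted
    """
    return np.interp(ages, group_ages, group_values)

# Hypothetical anchor ages and made-up annual consumption figures
group_ages = np.array([30.0, 40.0, 50.0, 60.0])
group_values = np.array([8000.0, 10000.0, 9000.0, 7000.0])

ages = np.arange(30, 61)  # single years of age from 30 through 60
single_year = interpolate_to_single_years(group_ages, group_values, ages)
```

Anchoring each group at its midpoint is one common convention; a different anchoring rule would shift the interpolated schedule slightly without changing the method.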

Public expenditures on health do not show up in the CES data since such 
health care is not purchased by the household. Fortunately, per capita health 
costs are available in three broad age categories (0-18, 19-64, and 65 + ) in 
Fisher (1980). Fisher also reports per capita days spent in hospitals by age, 
which increase linearly after age 65 at the rate of .0285. It is assumed that 
public expenditures on health rise identically. The resulting public costs for 
health were adjusted by the consumer price index to 1972 dollars to match 
the CES data (1981 Statistical Abstract, p. 458). 

Public expenditures on education were calculated from per capita public 
expenditures on education for elementary and secondary schooling (1981
Statistical Abstract, pp. 135, 147, 153, 158) and allocated by age of student
using age-specific public school enrollment rates (1981 Statistical Abstract, 
p. 139). The costs of private schools are, in principle, included as a compo¬ 
nent of household consumption by the CES. 

Other public expenditures, such as defense spending, were not included in 
the analysis because these are not age specific and hence will not be affected 
by a change in the population’s age distribution. 

Time Consumption

Leisure by age was estimated from time-use data collected in 1975 by the 
Institute for Social Research (ISR) of the University of Michigan and re¬ 
ported in M. Hill (1985, table A6.4C). This study distinguishes five broad 
categories of time use that appear appropriately viewed as “leisure”; these are 
active leisure, passive leisure, personal care, social entertaining, and partici¬ 
pating in organizations. The ISR data were available to us only for people 
aged 18 and over, and therefore leisure for children was estimated as a 
residual after school time. 

School time for children aged 5-17 was taken directly from Timmer, Eccles,
and O’Brien (1985, p. 468); data were collected by the 1975 ISR study.
Data were available in S-year age groups. For those aged 18 and over, it was 
assumed that enrolled students spent 6 hours in school plus 2 hours of 
homework, 5 days per week. It was further assumed that each weekday, a half 
hour was spent traveling to and from school. Enrollment rates for the adults 
were obtained from the 1981 Statistical Abstract (pp. 139, 158). 

Consumption of time used in home production of goods and services in our 
empirical analysis is the consumption of “home production” by someone aged 
x, when the home production is supplied by an individual with an efficiency 
level of 1.0. The consumption of home time by children was taken from 
regressions in Hill (1985) at the ISR, which report the additional home pro¬ 
duction time spent by parents with the presence of children at various ages. 
The regressions also report the average time spent in home production when 
no children are present, and this we took as the average amount of home time 
consumed by all adults. For those aged 65 and over, we assume that what they
supplied in home services was what they consumed in home services, adjusted
by their efficiency level (Hill 1985, table A6.4C). Likewise, for adults 18-64,
time supplied in home production is adjusted for the efficiency level of the 
age group. 


Production 

Efficiency weights were calculated for males from Denison (1979), largely on 
the basis of age-specific earnings. It was assumed that male and female 
efficiency are equal at each age and thus that the actual male-female wage 
differences are due either to discrimination or to the specialization of females 
in nonmarket production. The wage rate (per efficiency unit hour) was cal¬ 
culated so as to satisfy the budget constraint and was about $3.50 per hour 
(which would apply to adults aged 35-44 in 1972-73). 

A problem arises in how to assign efficiency weights to the elderly since 
wage data would not be appropriate for age groups not typically in the labor 
market. We assume that efficiency declines from .75 at age 65 to .5 at age 100. 
We also made the calculations assuming that efficiency declined to zero at age 
100. Fortunately, this did not affect our basic results. Under this latter as¬ 
sumption, the average ages of both consumption and production of course 
declined, but they declined by almost the same amount (1 year). Since we are 
interested in differences between these average ages, the assumption about 
the elderly’s efficiency levels does not appear to be troublesome. 

We also had to make a rather arbitrary judgment on the efficiency of 
teenagers. Here we assumed that 10-year-olds are 7 percent as “efficient” as
45-year-olds, rising to 40 percent as efficient by age 19. 
Appendix B

Within-Household Time Costs of the Third Child

The estimates of time costs of a third child are based on Turchi (1975) and
Hill and Stafford (1977). Turchi (p. 92) distinguishes time costs by parity and
finds that a second or third child increases the mother’s housework time by 
about 4,500 hours over its first 18 years, while a first child increases house¬ 
work time by about twice this, or 9,000 hours. 

The Hill and Stafford paper allows a more comprehensive examination of 
the effect of children on household time allocation but unfortunately does 
not distinguish effects by parity of child. We impose Turchi’s finding that the 
second and third children have half the effect of first ones on Hill and Staf¬ 
ford’s average figures. It is reassuring that Hill and Stafford’s average figures 
for the incremental housework associated with an average child are virtually 
identical to those found by Turchi for an average child. 

From regressions reported in Hill and Stafford (tables 1 and 2), we can 
derive the effects of an average child on the allocation of the mother’s and 
father’s time to home services and market work, expressed in cumulative 
hours over the first 18 years of a child’s life: 



          Change in Housework    Change in Market Work

Mother    +5,860                 -4,550
Father    +640                   +1,410

The implication is that the mother's lifetime leisure is reduced by 1,310 
hours and the father’s by 2,050 hours, for a combined leisure reduction of 
3,360 hours. Under the assumption that the second and third have half the 
effect of the first and that these families had three children on average, this 
number should be multiplied by .75 to get the figure for a third child, or 
2,520 hours. Evaluated at $3.50 per hour (assuming that the efficiency of the
time supplied is close to unity), this comes to $8,820 in 1972 units.

But this is the value of the incremental time taken only from the leisure of 
the parents; presumably the care of the child is financed from other time uses 
as well: siblings receive less attention, the house is messier, dinner is less well 
prepared, and so on. We wish to know how much additional time the house¬ 
hold would require in order to be just as well off materially as before the third 
child, and this will be more than 2,520 hours. 

How much more? For market goods, Lee (1985) found that the compen¬ 
sating increase was about 50 percent greater than the expenditure on the 
third child. Applied here, that would imply a total time cost of 3,780 hours
(= 1.5 × 2,520), or $13,230 (= 3.5 × 3,780).
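
The chain of arithmetic in this appendix can be retraced in a few lines of Python. The sketch below only restates the figures given in the text; the function name, argument names, and dictionary keys are illustrative and not part of the original analysis.

```python
def third_child_time_cost(wage=3.50, parity_factor=0.75, compensation_factor=1.5):
    """Retrace the Appendix B calculation of the time cost of a third child."""
    # Cumulative hours over a child's first 18 years (Hill and Stafford figures)
    mother_housework, mother_market = 5860, -4550
    father_housework, father_market = 640, 1410

    # Leisure lost is the net increase in total work time for each parent
    mother_leisure_loss = mother_housework + mother_market
    father_leisure_loss = father_housework + father_market
    average_child_hours = mother_leisure_loss + father_leisure_loss

    # Scale the average-child figure down to a third child
    third_child_hours = parity_factor * average_child_hours

    # Compensating time, following the 50 percent markup found for market goods
    compensated_hours = compensation_factor * third_child_hours

    return {
        "mother_leisure_loss": mother_leisure_loss,
        "father_leisure_loss": father_leisure_loss,
        "average_child_hours": average_child_hours,
        "third_child_hours": third_child_hours,
        "third_child_value": wage * third_child_hours,
        "compensated_hours": compensated_hours,
        "compensated_value": wage * compensated_hours,
    }

result = third_child_time_cost()
```

Changing `parity_factor` or `compensation_factor` shows how sensitive the final dollar figure is to the two adjustments imposed on the Hill and Stafford averages.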
The Institutional Framework
and Economic Development 


Gerald W. Scully 

University of Texas at Dallas 


The compound growth rates of per capita output and Farrell-type 
efficiency measures for 115 market economies over the period 
1960-80 were compared with measures of political, civil, and eco¬ 
nomic liberty. It was found that the institutional framework has 
significant and large effects on the efficiency and growth rate of
economies. Politically open societies, which subscribe to the rule of 
law, to private property, and to the market allocation of resources, 
grow at three times the rate and are two and one-half times as 
efficient as societies in which these freedoms are abridged. 


How much material progress has mankind made in modern times 
and how much has this progress been affected by the choice of the 
institutional framework designed to bring it about? The Western industrial
countries and many of the former colonies chose an institutional
framework that permitted a wide latitude for individual initiative,
choice, and responsibility. In general, these countries are
capitalist and democratic and are committed to the rule of law. Rising 
nationalism and the independence movement after World War II 
gave many new nations the opportunity to choose an institutional 
framework by which they could progress. State control and economic 
planning, with a concomitant deemphasis, if not denigration, of indi¬ 
vidual initiative, were (and remain) fashionable ideas. Many nations 
selected this route for economic development. Certainly, sufficient
time has passed to judge if these choices affected economic progress. I
conclude in this paper that nations that have chosen to suppress economic,
political, and civil liberties have gravely affected the standard
of living of their citizens.

I thank Tom Fomby, Kathy Hayes, Steve Guisinger, C. A. Knox Lovell, Larry Merville,
Dale Osborne, Phil Porter, Dan Slottje, and Gordon Tullock for helpful comments
on an earlier draft. Abdullah Al-Obaidan and Chung Kwoon-jo provided competent
research assistance.

I. Two Paradigms on Economic Progress 

There are two contrasting paradigms that point to the path by which 
men progress economically: individualism and statism. Both para¬ 
digms are old and are still debated. That state direction leads to 
material progress is the older vision. The wisdom of the state in creat¬ 
ing the conditions for material progress through intervention and 
regulation was first articulated systematically by the mercantilists. In 
modern times state intervention is justified by the “vicious circle of 
poverty” thesis. This thesis has been advanced by many writers but 
has been elaborated substantially by Myrdal (1957, 1968). The con¬ 
trasting vision is that material progress is greatest if individuals have 
the right to pursue their own affairs unmolested by the state. The 
conceptualization of society as a nexus of private arrangements that 
are mutually beneficial is contained in the contractarian political the¬ 
ory of David Hume, the concept of law as sanctioned private arrange¬ 
ments by Sir Edward Coke, and the economic theory and policy of 
laissez-faire by Adam Smith. Bauer and Yamey (1957) are modern 
thinkers who adopt Smith's premises on economic development. 

For the classical school, rights are a precondition for human prog¬ 
ress. Life, liberty, and property are not additively separable attri¬ 
butes; the diminution of one diminishes all. Security of rights affects 
their value (utility). Axiomatically, certain prospects have more value 
than uncertain ones. Security of rights leads to greater individual 
(national) wealth. Thus judge-made (common) law, with its adherence 
to precedent and the rule of stare decisis, is more certain than civil 
law. One legislature is not bound by the decisions of previous legisla¬ 
tures. Transaction costs are higher when law is uncertain (Scully 
1987). Security of property and certainty about the legal claim on the 
stream of income lead to higher rates of capital accumulation. Capital 
accumulation is a sine qua non of economic progress. 

The great concern of the twentieth century, after the reconstruc¬ 
tion of the war-ravaged economies and the restoration of the interna¬ 
tional economic order, has been the improvement in the standard of 
living of the underdeveloped nations. The fashionable view was that 
the gap between the rich and the poor nations was caused by a vicious 
circle of poverty that required draconian measures to break. The 
remedy for breaking the circle is state control and economic planning. 
In a nation’s commercial relations with the world, import restrictions, 
export subsidization, and foreign exchange controls are the appropriate
commercial policy. The theories of comparative advantage and
free trade are viewed as fallacious. There are “sound reasons why it 
may choose to produce at home things which it could import more 
cheaply or to export things at a loss to be covered by subsidy” (Myrdal 
1957, p. 98). A decade later in Asian Drama, Myrdal (1968, pp. 67,
115-16, passim) abandoned what could be interpreted as an educa¬ 
tive approach to social change for compulsion. Now he wanted a 
complete transformation of the values and attitudes that people hold 
and in the institutions that foster those values. 

Of course, Adam Smith ([1776] 1937) would have none of this. 
Self-interest promoted the general welfare. For his mercantilist pre¬ 
decessors, national wealth was the stock of precious metals at hand. 
There were policies to increase their sum. For Smith national wealth 
was the aggregation of individual wealth. His conception of national 
wealth was radically different from the mercantilist conception, and, 
hence, the appropriate institutional structure, property rights, and 
policies were quite different. To the mercantilist the legal, political, 
and economic institutions of mercantilism were justified in the public 
interest of increasing the national wealth (the stock of precious met¬ 
als). Smith condemned this institutional framework as anathema to 
private interests. Given his theorem of the “invisible hand,” mercantil¬ 
ist policies conflicted with the public interest as well. 


II. Data and Variable Construction 

The effects of the structure of property rights on the allocation of 
resources within firms now are well recognized. Economies can be 
thought of as big firms. Just as the efficiency of firms is affected by the 
structure of property rights, so is the efficiency of economies. Firms 
choose a particular organizational form, but the political, social, legal, 
and economic system within which firms make those choices is exoge¬ 
nous to the firm. Economies or nations determine the rights structure 
or the “rules of the game” in which individual economic actors make 
choices. This choice of the institutional framework of the economy 
has consequences for the allocation of resources (efficiency) in the 
economy. 

Let an economy be described by a simple neoclassical production
function, homogeneous of degree one in the inputs. 1 In intensive
form the production function is written as $y = f(k)$, where $y = Y/L$,
$k = K/L$, and $Y$, $L$, and $K$ are national output, labor force, and capital
stock, respectively. Differentiating the production function with respect
to time and dividing by $y$ yields $g_y = \varepsilon_k \cdot g_k$, where $g_y$ is the growth
rate of output per head, $\varepsilon_k$ is the elasticity of output per head with
respect to the capital-labor ratio, and $g_k$ is the growth rate of the
capital-labor ratio. The neoclassical growth function is the basis for
the statistical growth model to be presented below.

1 The assumption is testable. The production function was estimated for the entire
sample of economies. An F-test on the unrestricted model vs. the restriction of unity for
the sum of the coefficients yielded F(1, 112) = 1.39, which is insignificant.
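
As a rough numerical check on the growth identity stated in this paragraph (growth of output per head equals the elasticity times the growth of the capital-labor ratio), the sketch below assumes a Cobb-Douglas form f(k) = k^alpha, for which the elasticity is the constant alpha. The functional form, the value alpha = 0.5, the 3 percent growth rate, and both function names are assumptions made purely for illustration; none of them comes from the paper.

```python
import math

def output_growth(elasticity, capital_labor_growth):
    """Growth of output per head implied by g_y = elasticity * g_k."""
    return elasticity * capital_labor_growth

def numerical_output_growth(alpha, g_k, k0=1.0, dt=1e-4):
    """Finite-difference growth rate of y = k**alpha when k grows at rate g_k."""
    k1 = k0 * math.exp(g_k * dt)        # capital-labor ratio a short time later
    y0, y1 = k0 ** alpha, k1 ** alpha   # output per head before and after
    return math.log(y1 / y0) / dt       # continuous-time growth rate of y

alpha, g_k = 0.5, 0.03                  # hypothetical elasticity and growth rate
analytic = output_growth(alpha, g_k)
numeric = numerical_output_growth(alpha, g_k)
print(analytic, numeric)
```

The two numbers should agree up to floating-point error, which is all the identity asserts for a constant-elasticity technology.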

The effect of an increase in the growth rate of the capital-labor 
ratio on the growth rate of real per capita gross domestic product 
depends on how efficiently resources are employed in the economy. 
For equal rates of capital formation, economies that transform inputs 
into output relatively inefficiently will grow more slowly than efficient 
economies. One or more of the economies described by the neoclassi¬ 
cal production function above will have values of output per head that 
are greater than those of other economies with similar values of the 
input ratio. These economies are the most technically efficient in 
transforming inputs into output. Such economies are said to be fron¬ 
tier efficient. It is hypothesized that efficiency differences between 
economies are the result of differences in the efficiency properties of 
the rights structure or the institutional framework chosen. Designate 
the efficient economies as $y^*$, the efficiency frontier. Economies can be
compared to the efficiency frontier, and a measure of efficiency, EFF,
is defined as $\mathrm{EFF} = y/y^*$, with $0 < \mathrm{EFF} \le 1$.

Technical efficiency measures were estimated in deterministic and 
stochastic specifications with little difference to the conclusion. In the 
Aigner and Chu (1968) deterministic frontier, the one-sided error 
term requires $y \le f(k)$. Under the assumption that all measurement
errors are negligible, the error term strictly captures technical 
efficiency differences and is computed from the vector of residuals. 2 
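
To make the idea of a deterministic frontier concrete, the following sketch computes efficiency scores from a one-sided residual. It is not the Aigner and Chu programming estimator used in the paper: as a simpler stand-in it uses corrected ordinary least squares, fitting a log-linear function by OLS and shifting the intercept up until every observation lies on or below the fitted frontier. The log-linear form, the function name `cols_efficiency`, and the data are all hypothetical.

```python
import numpy as np

def cols_efficiency(y, k):
    """Efficiency scores EFF = y / y* from a corrected-OLS deterministic frontier.

    y -- output per head for each economy
    k -- capital-labor ratio for each economy
    """
    log_y, log_k = np.log(y), np.log(k)
    # OLS fit of ln y = intercept + slope * ln k (polyfit returns slope first)
    slope, intercept = np.polyfit(log_k, log_y, 1)
    residuals = log_y - (intercept + slope * log_k)
    # Raising the intercept by the largest residual puts the frontier above
    # every observation, so ln(y / y*) = residual - max(residual) <= 0
    return np.exp(residuals - residuals.max())

# Made-up data for five economies
k = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = np.array([1.0, 1.2, 2.1, 2.5, 4.2])
eff = cols_efficiency(y, k)
```

By construction at least one economy receives a score of exactly 1 and defines the frontier, while every other score lies strictly between 0 and 1.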

The cross-country economic and institutional data employed in this 
study come mainly from two sources. Summers and Heston (1984) 
have constructed internationally comparable economic series for a 
large number of countries over the period 1960-80. Data on the 
institutional characteristics of countries come from Gastil (1982), who 
has annually published (since 1973) country rankings of political lib¬ 
erty and civil liberty, type of economic system, and other measures of 
freedom. The institutional variables employed in this study are averages
of the Gastil rankings for the period 1973-80. 3

2 See Porter and Scully (1987) for a fuller discussion of the technique of measuring
efficiency difference across rights regimes.

3 Use of institutional data over the period 1973-80 as a predictor of economic
growth and efficiency over 1960-80 is suspect. A variety of tests were conducted to 
verify that the use of these data was not problematical. Briefly, Gastil's rankings for 
1973 were significant predictors of the 1984 rankings and by inference should measure 
Data on real gross domestic product per capita, population, and the
percentage of real gross domestic product devoted to gross domestic 
investment were available annually for 115 market economies for the 
period. From these data the compound growth rate of real per capita 
gross domestic product and the compound growth rate of the real 
capital-labor (population) ratio over the period were calculated. 
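
A compound growth rate over a period can be computed from the endpoint values alone. The sketch below assumes that endpoint formula; the paper does not say whether growth rates were obtained this way or, for instance, from a fitted log-linear trend, and the function name and sample values are hypothetical.

```python
def compound_growth_rate(initial, final, years):
    """Annual compound growth rate that carries `initial` to `final` in `years` years."""
    return (final / initial) ** (1.0 / years) - 1.0

# Hypothetical example: real per capita GDP doubling between 1960 and 1980
rate = compound_growth_rate(1000.0, 2000.0, 20)
print(f"{rate:.4f}")
```

Applying the same function to the capital-labor ratio would give the counterpart of the CHGKL variable under the same assumption.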

The use of population in the calculation of these variables as a 
proxy for the labor force is disagreeable but conventional. The con¬ 
struction of the series on the annual capital stock by country over the 
period is based on the methodology suggested by Harberger (1978). 
The Summers and Heston annual data series on real gross per capita 
domestic product, population, and the percentage of real domestic 
product devoted to gross domestic investment provide the basic data 
for the construction of the capital stock series. Harberger assumes a 
depreciation rate of 2.5 percent per year for buildings and 8.0 per¬ 
cent per year for machinery and equipment. These are the deprecia¬ 
tion rates used in this study. To obtain country estimates for the 
capital stock in the initial year, gross investment for 1960 was multi¬ 
plied by the fraction of noninventory investment and divided by the 
weighted depreciation rate. 
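
The initial-year calculation described above can be written out directly. In the sketch below the two depreciation rates are the ones quoted in the text, but the share of buildings in the capital stock, the investment figure, the noninventory fraction, and both function names are hypothetical placeholders, since the paper does not report the weights it used.

```python
def weighted_depreciation(building_share, building_rate=0.025, equipment_rate=0.08):
    """Depreciation rate weighted by the (assumed) share of buildings in capital."""
    return building_share * building_rate + (1.0 - building_share) * equipment_rate

def initial_capital_stock(gross_investment, noninventory_fraction, depreciation_rate):
    """Initial capital stock: noninventory investment divided by the depreciation rate."""
    return gross_investment * noninventory_fraction / depreciation_rate

# Hypothetical inputs for a single country in 1960
delta = weighted_depreciation(building_share=0.5)
k_1960 = initial_capital_stock(gross_investment=100.0,
                               noninventory_fraction=0.9,
                               depreciation_rate=delta)
print(delta, k_1960)
```

A lower weighted depreciation rate implies a larger initial stock for the same 1960 investment, so the assumed building share matters for the level of the series.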

The variables employed to capture some of the effects of the in¬ 
stitutional framework on economic development rank the level of 
political, civil, and economic liberty in nations of the world. Gastil has 
created two measures of liberty: political liberty and civil liberty. 4 
Political (civil) rights are ranked by him from 1 (the highest degree of 
liberty) to 7 (the lowest). The political rights rankings are based on the 
degree to which individuals in a state have control over those who 
govern. The civil rights rankings purport to measure the rights of the 
individual (e.g., independence of the judiciary or freedom of the 
press) relative to the state. 5 Gastil measures economic liberty in two 
ways. He categorizes economic systems as capitalist, mixed-capitalist, 
capitalist-statist, mixed-socialist, or socialist. He also describes the 
level of economic liberty in nations. Economic freedom is designated 
by him as high, medium-high, medium, low-medium, and low. 

Econometric difficulties arise in employing ordinal rankings as con- 


rights back in the 1960s. The frequency of classification error (free vs. not free) over 
the 12-year period for the 115 countries was low (7 percent). Other data sources (for 
1960-65) were compared with Gastil's 1973 measures and were found to be correlated 
(r - .72 for N = 82). 

4 Gastil mainly used newspaper reports, journals, Amnesty International and other 
rights reports, the U.S. Department of State and its reports to Congress on the human 
rights of nations receiving American assistance, and other public sources for the data to 
construct these rankings. He compared his measures with other measures. 

s For a full discussion of Gastil's rights variables and cross-validation of the rights 
variables with 30 specific rights measures across 63 countries for which there were 
comparative data, see Scully (1987). 

TABLE 1

Regressions Relating the Separate Effects of Institutional Variables on
Economic Growth (CAPGWTH) over the Period 1960-80 (N = 115)

Equation                CHGKL          Independent    Independent
Number      Constant    Coefficient    Coefficient    Variable         R²

(1)          .0198       .5065                                         .3456
            (11.56)     (7.82)
(2)          .0185       .5090          .0068         POL OPEN         .3571
            (9.82)      (7.93)         (1.74)
(3)          .0255       .4992         -.0114         POL CLOSED       .4151
            (11.60)     (8.15)         (-3.80)
(4)          .0181       .5017          .0094         INDIV RIGHTS     .3714
            (9.87)      (7.90)         (2.38)
(5)          .0243       .5102         -.0120         STATE RIGHTS     .4181
            (12.26)     (8.36)         (-3.88)
(6)          .0171       .5050          .0105         FREEMKT          .3893
            (9.08)      (8.07)         (3.01)
(7)          .0233       .4765         -.0123         COMMAND          .4058
            (12.24)     (7.65)         (-3.53)

Note. Variable definitions: CAPGWTH = the compound growth rate of real per capita gross domestic product
(1960-80); CHGKL = the compound growth rate in the capital-labor ratio (1960-80); POL OPEN = 1 if the Gastil
ranking of political liberty is less than 2.0, and 0 otherwise; POL CLOSED = 1 if the Gastil ranking of political liberty
is equal to or greater than 5.0, and 0 otherwise; INDIV RIGHTS = 1 if the Gastil ranking of civil liberty is less than
2.0, and 0 otherwise; STATE RIGHTS = 1 if the Gastil ranking of civil liberty is equal to or greater than 5.0, and 0
otherwise; FREEMKT = 1 if the ranking of economic liberty based on Gastil is less than 2.0, and 0 otherwise;
COMMAND = 1 if the ranking of economic liberty is equal to or greater than 5.0, and 0 otherwise.


tinuous variables. The most straightforward solution is the transfor¬ 
mation of the continuous variable into a set of dummy variables. This 
is the approach adopted here. For each measure of liberty, three 
dummy variables were constructed, which more or less correspond to 
Gastil’s categories of high, medium, and low levels of freedom. The 
variables are defined in table 1. 
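
A minimal sketch of this transformation follows. Only the cutoffs (a ranking below 2.0, and a ranking equal to or greater than 5.0) are taken from the note to table 1; the function name, the treatment of the middle range as the "medium" category, and the sample rankings are hypothetical and serve only as illustration.

```python
def liberty_dummies(ranking):
    """Map a Gastil ranking (1 = most liberty, 7 = least) to three 0/1 dummies.

    Cutoffs follow the note to table 1; the 'medium' category is assumed
    to be everything between the two cutoffs.
    """
    high = 1 if ranking < 2.0 else 0
    low = 1 if ranking >= 5.0 else 0
    medium = 1 - high - low
    return {"high": high, "medium": medium, "low": low}


# Hypothetical rankings for three unnamed countries.
sample_rankings = {"A": 1.0, "B": 3.5, "C": 6.0}
dummies = {name: liberty_dummies(r) for name, r in sample_rankings.items()}

for name, d in dummies.items():
    print(name, d)
```

With the political liberty ranking, the "high" dummy plays the role of POL OPEN and the "low" dummy the role of POL CLOSED; leaving the "medium" dummy out of a regression makes the middle group the reference category.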


III. Empirical Evidence: The Effect 
of the Institutional Framework 

Regressions relating each of the dummy variables, which describe a
characteristic of the institutional framework, to the growth rate of per
capita real gross domestic product appear in table 1. 8 All the independent
variables are of the expected sign and are significant at the 1
percent level (the coefficient POL OPEN is significant at the 5 percent 
level). The growth rates of real domestic product per capita by institu¬ 
tional attribute were calculated from these regressions. On average, 
politically open societies grew at a compound real per capita rate of 
2.53 percent per annum compared to a 1.41 percent growth rate for 

8 Given the wide range of sizes of economies in this study, the violation of constant
variance across countries ranked by size is a possibility. Several tests were conducted 
including the Goldfeld-Quandt test. The assumption of normality was validated. 
politically closed societies. On average, societies that subscribe to the
rule of law grew at a 2.75 percent rate compared to a 1.23 percent
rate for societies in which state rights take precedence over individual
rights. On average, societies that subscribe to private property rights
and a market allocation of resources grew at a 2.76 percent rate
compared to a 1.10 percent rate in nations in which private property
rights are circumscribed and the state intervenes in resource
allocation. Thus the institutional framework is not only a statistically
significant explanation of intercountry variation in the growth rate of 
real per capita gross domestic product but also a phenomenon of 
considerable magnitude. Growth rates in societies that circumscribe 
or proscribe political, civil, and economic liberty are only 40-56 per¬ 
cent (depending on the attribute) of those in societies in which indi¬ 
vidual rights are protected. 
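
The text does not spell out how the growth rates were calculated from the regressions. One reading that reproduces every reported figure is the sum of the constant and the dummy coefficient in each equation of table 1, that is, the fitted value with CHGKL held at zero. The sketch below rests on that assumption (it is an inference from the table, not a stated method), and the names used are illustrative.

```python
# (constant, dummy coefficient) from equations (2)-(7) of table 1.
TABLE1 = {
    "POL OPEN": (0.0185, 0.0068),
    "POL CLOSED": (0.0255, -0.0114),
    "INDIV RIGHTS": (0.0181, 0.0094),
    "STATE RIGHTS": (0.0243, -0.0120),
    "FREEMKT": (0.0171, 0.0105),
    "COMMAND": (0.0233, -0.0123),
}


def implied_growth(variable):
    """Percent growth implied by constant + dummy coefficient (CHGKL assumed 0)."""
    constant, coefficient = TABLE1[variable]
    return round(100 * (constant + coefficient), 2)


growth = {name: implied_growth(name) for name in TABLE1}

# Growth in the restricted category as a percentage of growth in the free one.
pairs = [("POL CLOSED", "POL OPEN"),
         ("STATE RIGHTS", "INDIV RIGHTS"),
         ("COMMAND", "FREEMKT")]
ratios = {restricted: round(100 * growth[restricted] / growth[free])
          for restricted, free in pairs}

print(growth)
print(ratios)
```

Under this reading the three ratios span the 40-56 percent range quoted in the text.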

Political, civil, and economic liberty are logically separable, al¬ 
though the degree to which these rights may be unbundled and re¬ 
main robust is highly questionable. Different nations may bundle 
rights differently, for example, offering more economic liberty and 
less political or civil liberty than another country. Undoubtedly, there 
are limits to this unbundling of individual rights; that is, economic 
liberty may be relatively meaningless in a tyranny. If these freedoms 
can be unbundled and nations engage in such practices, then each of 
the variables, which measure an attribute of the institutional frame¬ 
work, ought to emerge statistically significant in the regression that 
includes all the rights variables. If the types of liberties used in this 
study cannot be unbundled, then the matrix is singular. All the in¬ 
stitutional framework dummy variables are included in equation (1) 
in table 2. The increase in the standard errors of all the institutional 
framework dummy variables indicates the presence of some multicol- 
linearity. The statistical results are suggestive of the fact that different 
nations do bundle rights differently but that the separability of rights 
is relatively weak. 

The calculated compound growth rate of real domestic product per 
capita for the average nation that has an institutional framework with 
a high degree of political liberty, civil liberty, and economic liberty is 
2.73 percent per annum. The calculated growth rate for the average 
nation with an institutional framework in which political rights are 
proscribed, state rights take precedence over individual rights, pri¬ 
vate property is circumscribed, and the state intervenes in resource 
allocation is 0.91 percent per annum. Hence, the average growth rate 
in societies in which these freedoms are restricted is one-third of that 
of free societies. These combined restrictions on liberty constitute a 
67 percent tax on the wealth of the citizens of such states. 

Efficiency measures for each economy were calculated and re¬ 
gressed on the institutional variables. Where the rights variables are 

TABLE 2

Regressions Relating the Effects of All the Institutional Variables on
Economic Growth, Economic Efficiency, and the Change in Economic
Efficiency (N = 115)

                          Dependent Variable
Independent       CAPGWTH       EFF80         CHGEFF
Variable          (1)           (2)           (3)

Constant           .0261         .5124         .1153
                  (8.21)        (13.83)       (4.25)
CHGKL              .4713         . . .         . . .
                  (7.64)
POL OPEN          -.0170        -.0200        -.2196
                  (-1.80)       (-.18)        (-2.67)
POL CLOSED        -.0065        -.1228        -.0628
                  (-1.52)       (-2.38)       (-1.66)
INDIV RIGHTS       .0151         .1620         .1550
                  (1.53)        (1.38)        (1.80)
STATE RIGHTS      -.0033        -.0080         .0071
                  (-.71)        (-.15)        (.17)
FREEMKT            .0031         .1304         .0109
                  (.59)         (2.06)        (.24)
COMMAND           -.0072        -.0700        -.0781
                  (-1.67)       (-1.38)       (-2.11)
R²                 .4404         .4185         .0957



entered as single variables in simple regressions (not shown here), all 
the rights variables are of the expected sign and are statistically 
significant at well above the 1 percent level. The average economy 
that is politically open, in which individual rights take precedence 
over state rights, or in which private property and market allocation 
of resources prevail has an efficiency level of .738-.774, depending 
on the freedom measure. On the other hand, the average economy 
that is politically closed, in which state rights prevail, or in which 
private property and the market allocation of resources are cir¬ 
cumscribed has an efficiency rating of .338-.3S5. Thus societies in 
which freedom is restricted are less than half as efficient in converting 
resources into gross domestic product as free societies are. Alterna¬ 
tively, more than twice as much output could be produced with the 
same resource endowment in these societies if liberty prevailed. Com¬ 
bining all the rights variables into the efficiency equation (eq. [2] in 
table 2) changes the results only marginally (.785 vs. .312). Obviously, 
there is some multicollinearity present in the equation. 
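
The combined figures quoted in the text (2.73 vs. 0.91 percent growth, and .785 vs. .312 efficiency) can be reproduced from table 2 by adding the constant to the three "free" or the three "restricted" dummy coefficients. The sketch below assumes, as an inference from the numbers rather than a stated procedure, that CHGKL is held at zero in the growth equation; the function and variable names are illustrative.

```python
# Coefficients from table 2: equation (1) is CAPGWTH, equation (2) is EFF80.
TABLE2 = {
    "CAPGWTH": {"Constant": 0.0261, "POL OPEN": -0.0170, "POL CLOSED": -0.0065,
                "INDIV RIGHTS": 0.0151, "STATE RIGHTS": -0.0033,
                "FREEMKT": 0.0031, "COMMAND": -0.0072},
    "EFF80": {"Constant": 0.5124, "POL OPEN": -0.0200, "POL CLOSED": -0.1228,
              "INDIV RIGHTS": 0.1620, "STATE RIGHTS": -0.0080,
              "FREEMKT": 0.1304, "COMMAND": -0.0700},
}
FREE = ("POL OPEN", "INDIV RIGHTS", "FREEMKT")
RESTRICTED = ("POL CLOSED", "STATE RIGHTS", "COMMAND")


def fitted(equation, dummies):
    """Constant plus the coefficients of the dummies set to 1 (CHGKL assumed 0)."""
    coefs = TABLE2[equation]
    return coefs["Constant"] + sum(coefs[name] for name in dummies)


growth_free = fitted("CAPGWTH", FREE)
growth_restricted = fitted("CAPGWTH", RESTRICTED)
eff_free = fitted("EFF80", FREE)
eff_restricted = fitted("EFF80", RESTRICTED)

print(f"growth: {100 * growth_free:.2f}% vs {100 * growth_restricted:.2f}%")
print(f"growth ratio: {growth_restricted / growth_free:.3f}")
print(f"efficiency: {eff_free:.3f} vs {eff_restricted:.3f}")
print(f"efficiency ratio: {eff_free / eff_restricted:.2f}")
```

The growth ratio of about one-third is the basis for the "67 percent tax" description, and the efficiency ratio of roughly two and one-half matches the summary figure given in the conclusions.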

Growth and efficiency are linked. For equal rates of capital forma¬ 
tion, economies that transform inputs into output relatively ineffi¬ 
ciently will grow more slowly than efficient economies. But the order 
of magnitude of this effect is unknown. The effect of the institutional 
framework on economic growth and on economic efficiency has been 






TABLE 3

Regressions Relating the Separate Effects of Institutional Variables on the
Change in Economic Efficiency (CHGEFF) between 1960 and 1980 (N = 115)

Equation                               Independent
Number      Constant    Coefficient    Variable         R²

(1)          .0587       -.0121        POL OPEN         .0011
            (3.85)       (-.35)
(2)          .0855       -.0602        POL CLOSED       .0346
            (4.60)       (-2.26)
(3)          .0529        .0176        INDIV RIGHTS     .0023
            (3.49)       (.51)
(4)          .0779       -.0578        STATE RIGHTS     .0288
            (4.61)       (-2.09)
(5)          .0468        .0363        FREEMKT          .0034
            (2.97)       (1.18)
(6)          .0774       -.0839        COMMAND          .0550
            (5.08)       (-2.76)



shown. Is there evidence that the efficiency of some economies has 
declined and that this decline is related to choice of the institutional 
framework? Is there evidence that any observed decline in economic 
efficiency has a significant impact on the rate of economic growth? 
These questions are addressed in this section. 

Comparative static changes in economic efficiency for each econ¬ 
omy in the sample were calculated by estimating the deterministic 
frontier production function for 1960, calculating the efficiency mea¬ 
sure for 1960, and constructing the variable CHGEFF = EFF80 
- EFF60. 

The computed changes in economic efficiency over the period were 
compared with the institutional framework variables. The results
appear in table 3. The t-values of POL OPEN, INDIV RIGHTS, and
FREEMKT indicate a lack of statistical significance, while those of 
POL CLOSED, STATE RIGHTS, and COMMAND are of the ex¬ 
pected sign and are significant at above the 2.5 percent level. This is 
precisely the result that one would expect. All the efficiency gains 
from freedom have been captured in free societies, and one would 
not expect an improvement in efficiency over time. On the other 
hand, unless individual rights are proscribed in closed societies, un¬ 
certainty about freedom, random changes in the status of the individ¬ 
ual relative to the state, or a secular decline in liberty can cause 
efficiency to decline in societies in which liberty is constrained. 

IV. Summary and Conclusions 

Like water that seeks its own level, resources are predisposed to flow 
to their highest valued use. But a necessary condition for achieving 
this efficiency result is that all resources be owned exclusively by
private individuals and that these resources be transferable. The
political, social, legal, and economic framework of society defines what
resources can be owned, who can own them, and how they can be 
employed. The institutional framework sets the parameters of rights 
in a society. The menu of choices open to mankind is rich. What is 
sanctioned in law ranges from judge-made rulings, as in common law, 
to decisions based on the “word of God,” as in Muslim law. In politics, 
the range is from the Vermont town hall meeting, in which every¬ 
thing is settled by the citizens, to totalitarian states, where even the 
most private of matters between people must be sanctioned by the 
state. In economics, the range is from purely private economies, in 
which even the legitimacy of the public goods property of the light¬ 
house is disputed, to communist societies (certain religious orders), in 
which everything is owned in common. 

A rich literature exists on the efficiency effects of different eco¬ 
nomic, political, social, and legal arrangements. While considerable 
testing of property rights theory, in the relatively narrow context of 
the U.S. institutional framework, has been undertaken at the firm or 
industry level, no empirical work has been done at the economywide 
level to assess the effects of different economic, political, and legal 
arrangements on economic efficiency and economic growth. The ex¬ 
istence of comparable international economic data and measures of 
political and civil liberty makes possible a test of the effect of the 
institutional framework on the efficiency and growth of economies. 

In this study of the world’s 115 market economies over the period 
1960-80, compound growth rates of real domestic product per capita 
and a measure of economic efficiency were compared with measures 
of political, civil, and economic liberty. It was found that the choice 
of the institutional framework has profound consequences on the 
efficiency and growth of economies. Politically open societies, which 
bind themselves to the rule of law, to private property, and to the 
market allocation of resources, grow at three times (2.73 to 0.91 per¬ 
cent annually) the rate and are two and one-half times as efficient as 
societies in which these freedoms are circumscribed or proscribed. 

More research is required. Richer measures of the institutional 
framework need to be developed. More sophisticated models, which 
link these measures to the performance of economies, would shed 
more light on the topic. But if the order of magnitude of the effects of 
the institutional framework found in this study holds in subsequent 
research, as I believe it will, the issue of the configuration of the 
appropriate structure of property rights for economic development 
needs to be brought to the forefront in the development literature. 
The roles of capital accumulation, innovation, human capital, and 
entrepreneurship are widely recognized as sources of economic 
growth. Still more fundamental, a precondition for accumulation and
innovation, is the right to capitalize.

The Impact of Product Recalls on the Wealth
of Sellers: A Reexamination

In a recent article in this Journal, Jarrell and Peltzman (1985) opened
an important new area to event study research when they analyzed 
the effect of automobile and drug product recalls on the shareholders 
of firms within these industries. Earlier, Crafton, Hoffer, and Reilly 
(1981) and Reilly and Hoffer (1983) found that severe automobile 
recalls have a significant short-term impact on demand, but neither 
study investigated the effect of recalls on industry shareholders. 

In their work, Jarrell and Peltzman found that for manufacturers 
of automobiles and ethical drugs, capital markets penalized share¬ 
holders far more than the direct costs of the recall campaign. Further, 
they concluded that in both industries, shareholders of competitor 
firms to the firm with the recalled product(s) also suffered wealth 
losses. Interestingly, they concluded that over the 1975-81 period 
shareholders of General Motors bore greater wealth losses from the 
recall of a Ford or Chrysler product than they did from the recall of a 
GM product. 

This paper presents several modifications to the Jarrell-Peltzman 
study, as it applies to the automobile industry, for the purpose of 

We appreciate the assistance of Sam Peltzman and the suggestions of an anonymous 
referee. We are grateful as well to Gailen Hite and Pamela Peterson. 
strengthening their approach in terms of its agreement with conventional
event study methodology. These modifications are shown to
have a pronounced effect on the conclusions to be drawn with respect 
to the automobile industry. In particular, after these modifications 
are made, little evidence remains in this study that share prices are 
significantly affected by automotive recalls. 

In the next section, we briefly review the approach taken by Jarrell
and Peltzman and the results they obtained. In the following section
we present our modifications of their approach to the automobile
industry and then proceed with a revised analysis of the effect of
automotive recalls on industry share prices.


Reviewing the Jarrell-Peltzman Approach 

Jarrell and Peltzman measured the full cost of a recall to automobile 
industry shareholders by examining the net-of-market (or excess) 
stock returns in the period surrounding public announcement of the 
recall. Using the Scholes Excess Returns file at the University of 
Chicago’s Center for Research in Security Prices, they analyzed the 
cumulative excess returns (CERs) for the automobile manufacturers 
over a 2-week window, CER(-5, 5). They indicated that their analysis
was limited to auto safety recalls of the National Highway Traffic 
Safety Administration (NHTSA) (p. 525) and further that their anal¬ 
ysis was confined to recalls of the Big Three domestic manufacturers 
involving vehicles in excess of 10,000, 20,000, and 50,000 units for 
Chrysler, Ford, and GM, respectively. No attempt was made to distin¬ 
guish between trivial recalls (such as incorrect placards) and recalls of 
greater severity (e.g., brake failure or axle separation). Wall Street 
Journal (WSJ) publication dates were considered to be the recall dates. 
Recalls that met the authors’ size criteria but were not reported in the 
WSJ were excluded from the sample. Over the 1967-81 period, they 
found that the WSJ reported 116 recalls that met their criteria; 63 of 
these recalls were reported in the 1975-81 period. 
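
As a rough sketch of the quantity being measured, a CER(-5, 5) can be computed by summing daily net-of-market returns over the 11 trading days centered on the announcement. The code below is an illustration under stated assumptions, not the procedure used with the Scholes Excess Returns file: the excess return is taken here to be the simple difference between the stock return and the market return, the t-value is a plain cross-sectional test of a zero mean, and all function names and sample numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def cumulative_excess_return(stock_returns, market_returns, start=-5, end=5):
    """Sum net-of-market returns over event days start..end (inclusive).

    Both arguments map an event day (0 = announcement day) to a daily return.
    """
    return sum(stock_returns[d] - market_returns[d] for d in range(start, end + 1))


def mean_cer_and_abs_t(cers):
    """Mean CER across events and the absolute t-value against a zero mean."""
    m = mean(cers)
    standard_error = stdev(cers) / math.sqrt(len(cers))
    return m, abs(m / standard_error)


# Hypothetical event: the stock loses 0.2% a day while the market gains 0.1%.
stock = {d: -0.002 for d in range(-5, 6)}
market = {d: 0.001 for d in range(-5, 6)}
cer = cumulative_excess_return(stock, market)

# Hypothetical CERs for three separate events.
avg, t_abs = mean_cer_and_abs_t([-0.03, -0.01, -0.02])
print(f"CER = {cer:.3f}, mean CER = {avg:.3f}, |t| = {t_abs:.2f}")
```

The tables that follow report exactly these two summary quantities, the mean CER in percent and its absolute t-value, for each grouping of recalls.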

On careful examination of the 63 recalls actually used by Jarrell 
and Peltzman over the 1975-81 period, we found that 21 did not 
meet the letter of their criteria. They included in their sample such 
nonrecall events as extended warranties on air conditioning equip¬ 
ment, Federal Trade Commission inquiries into premature engine 
wear, numerous truck recalls, and other non-NHTSA motor vehicle 
news events. Further, they omitted from their sample 16 NHTSA 
auto recalls reported in the WSJ that met their criteria. After these 
and other adjustments, we found that there are 66 observations that 
correctly meet the recall criteria established by Jarrell and Peltzman. 
In this paper, we limit our analysis to the 1975-81 period since the 

TABLE 1

Mean CERs for Auto Recalls, 1975-81: Unadjusted Mean CER(-5, 5)

                    Jarrell and Peltzman       Hoffer et al.
                    (n = 63)                   (n = 66)
                    Mean (%)    Absolute       Mean (%)    Absolute
                                t-Value                    t-Value

Big Three           -2.48ᵃ                                 2.41**
General Motors      -1.28                                   .98
Ford                -3.02                                  3.04*
Chrysler            -3.48                                   .71

ᵃ This result states that the typical firm lost an average of 2.48 percent of its share price as a result of a recall of its
product.

* Significant at the 1 percent level.

** Significant at the 5 percent level.



authors acknowledge that their findings for the entire 1967-81 pe¬ 
riod hinge on the strength of their 1975-81 results (pp. 529, 532). 

We summarize primary results on their original sample of 63 as
well as our own on the revised sample of 66 for 1975-81 in tables 1
and 2. In table 1 we see that Jarrell and Peltzman found for the Big
Three as a whole and for Chrysler and Ford individually that the
CERs are significant and negative over the 11-day window surrounding
a recall by the firm. The results for our revised sample of 66 are
generally consistent with their findings except that the significance
levels are somewhat reduced and the Chrysler result in particular is
no longer significant.

TABLE 2

Mean CERs of Competitors by Company, 1975-81: Unadjusted Mean CER(-5, 5)

                                   Jarrell and Peltzman      Hoffer et al.
                                   (n = 63)                  (n = 66)
                                   Mean (%)    Absolute      Mean (%)    Absolute
                                               t-Value                   t-Value

All recalls of other companiesᵃ    -2.40       4.65*         -1.12       1.93***
Competitor company:
  General Motorsᵇ                  -2.92       5.34*         -1.89       4.12*
  Ford                             -1.81       2.24**        -1.33       1.76***
  Chrysler                          2.41       2.44**         -.28        .25
Recall company:
  General Motorsᶜ                  -2.61       2.39**        -1.60       1.45
  Ford                             -2.77       4.22*          -.31        .36
  Chrysler                         -1.25       1.47          -1.12       1.53

ᵃ This result states that the typical competitor lost an average of 2.40 percent of its share price as a result of
another firm's recall.

ᵇ This result states that GM's share price fell an average of 2.92 percent in response to a recall by either Chrysler
or Ford.

ᶜ This result states that when GM recalled an automobile, on average the price per share of Chrysler and Ford
stock declined 2.61 percent.

* Significant at the 1 percent level.

** Significant at the 5 percent level.

*** Significant at the 10 percent level.

In table 2, we see that for the 1975-81 period Jarrell and Peltzman 
found that the average loss to competitors from another firm’s recall 
virtually matched the recall company’s loss as reported in table 1. This 
conclusion holds for the revised sample of 66 as well, although again 
the significance levels are reduced for both figures. Regarding the six 
firm-specific results reported by Jarrell and Peltzman, the revised 
sample yields results that are generally consistent in sign (five out of 
six) but of typically lower significance levels. 

Revised Analysis 

We noted above that over the 1967-81 period Jarrell and Peltzman 
reported finding 116 recalls that met their criteria, with 63 of these 
occurring in the latter 1975-81 period. Of the total 116 recalls, they 
identified four as being overlapping events. However, their definition 
of a recall overlap differs from established event study methodology, 
and their results are dependent on their definition, as we shall show. 
Overlapping events in the authors’ definition are situations in which
the same manufacturer had two or more separate recalls with over¬ 
lapping event windows. However, they do not consider the effect of 
overlaps across manufacturers, even when two or more manufactur¬ 
ers had recalls on the same day. 

To see the bias that may be introduced by this definition, suppose 
that the WSJ reported a GM recall on day 0 and a Ford recall on day 
1. The Jarrell-Peltzman model would test, for example, whether the 
price of Ford stock is being influenced on days (-5, 5) by the GM
recall. If we acknowledge the potential price impact of the Ford recall 
on the price of Ford’s shares, however, we have a serious overlap 
problem. Particularly, when the GM recall announcement is used as 
our day 0, the Ford recall may well depress Ford’s share price during 
the (-4, 6)-day period, independent of any effect from the GM
recall. Thus, for 10 of the 11 days of the GM event window, Ford 
CERs may be affected by its own recall in addition to the GM recall. 
Jarrell and Peltzman would include this recall event window in their
sample and thus in their analysis of the impact of a GM recall on the 
share price of Ford. They would thus be led to the incorrect conclu¬ 
sion that the decline in the Ford share price over these 10 days was 
attributable to the GM recall. It is an essential part of conventional 
event study methodology that event windows be “clean windows,” that 
is, windows having no overlapping days with any other event window 



TABLE 3

Hoffer et al. Clean-Window Mean CERs for Auto Recalls, 1975-81

                    Unadjusted Mean            Adjusted Mean
                    CER(-5, 5)                 CER(-5, 5)
                    Mean (%)    Absolute       Mean (%)    Absolute
                                t-Value                    t-Value

Big Three            -.92       1.25            -.31        .41
General Motors        .37        .31             .98        .58
Ford                -2.59       2.51*          -1.95       1.70**
Chrysler             -.42        .90            1.09        .73

* Significant at the 5 percent level.

** Significant at the 10 percent level.


(see, e.g., Eades, Hess, and Kim 1984; Grinblatt, Masulis, and Titman 
1984). 
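
The overlap in the GM and Ford example can be checked directly, and the same check can be used to screen a sample for windows that share no days with any other event. The sketch below is illustrative only: event dates are expressed as hypothetical trading-day indices, the function names are not from the paper, and the screen simply drops any event whose (-5, 5) window intersects another event's window.

```python
def window_overlap_days(day_a, day_b, half_width=5):
    """Number of trading days shared by two (-half_width, +half_width) windows."""
    window_length = 2 * half_width + 1
    return max(0, window_length - abs(day_a - day_b))


def clean_window_events(events, half_width=5):
    """Keep events whose window shares no day with any other event's window."""
    clean = []
    for i, (firm, day) in enumerate(events):
        others = (other_day for j, (_, other_day) in enumerate(events) if j != i)
        if all(window_overlap_days(day, other, half_width) == 0 for other in others):
            clean.append((firm, day))
    return clean


# GM recall reported on day 0, Ford recall on day 1, and a hypothetical
# Chrysler recall far enough away (day 40) to have a clean window.
events = [("GM", 0), ("Ford", 1), ("Chrysler", 40)]

shared = window_overlap_days(0, 1)
clean = clean_window_events(events)
print(shared, clean)
```

In this illustration the GM and Ford windows share 10 of their 11 days, so both events are dropped and only the isolated recall survives the screen.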

In our analysis, we define a “clean-window recall” as a Big Three 
recall that has no CER(-5, 5) overlap with any other recall event
window meeting the Jarrell-Peltzman criteria. 1 With this definition,
our revised sample of 66 from the 1975-81 period falls to 29. With
these 29 recalls as our sample, we ran every test that Jarrell and 
Peltzman reported on automobiles over the 1975-81 period. 2 Tables 
3 and 4 show our results. Only one result remains significant at the 5 
percent level. Further, when adjustments are made for “other non- 

1 Jarrell and Peltzman used the WSJ story date as the official recall date. The official 
NHTSA recall date is the date on which the automotive manufacturer formally notifies 
the agency of the problem and of its intent to recall. This information is made available 
to the public through posting in the Technical Reference Library of the Department of 
Transportation. Widespread public announcement of the recall generally occurs sev¬ 
eral weeks later when the manufacturer issues a public press release. The delay is 
needed to permit dealers ample time to receive technical notices and necessary parts. 
According to the Detroit bureau chief of the WSJ, the Journal obtains its automotive 
recall information from manufacturer press releases. It does not review the Technical 
Reference Library files. To see if stock market participants were utilizing recall data 
publicly available in the Technical Reference Library, we analyzed the cumulative 
abnormal returns for the 11-day event interval around the NHTSA Technical Refer¬ 
ence Library recall release date for each of the 66 recalls in the corrected sample. We 
found no evidence that NHTSA’s release of recall information affects stock prices, 
either on the day of the recall or for the entire 11-day interval (see Hoffer, Pruitt, and
Reilly 1987). 

2 To address the issue of whether, by chance, our clean-window recall sample
included a disproportionate number of trivial recalls, we categorized them by severity
type. Using the same recall severity classifications developed by Crafton et al. (1981)
and Reilly and Hoffer (1983), we categorized each recall as trivial, intermediate, or
severe. Of the 29 clean-window recalls, 62 percent were of the most severe type, 28
percent were intermediate, and 10 percent were trivial. The comparable percentages 
for the Jarrell-Peltzman sample were 62, 26, and 12. Note, however, that all 29 recalls 
of our sample met their size criterion and were therefore "nontrivial" using their 
definition as well. 

TABLE 4

Hoffer et al. Clean-Window Mean CER(-5, 5) of Competitors by Company,
1975-81

                                   Unadjusted Mean           Adjusted Mean
                                   Mean (%)    Absolute      Mean (%)    Absolute
                                               t-Value                   t-Value

All recalls of other companiesᵃ      .65        .63            .35        .34
Competitor company:
  General Motorsᵇ                  -1.26       1.56          -1.19       1.29
  Ford                             -1.33        .87           -.81        .54
  Chrysler                           .21        .12           2.11       1.30
Recall company:
  General Motorsᶜ                  -1.51        .72            .10        .05
  Ford                               .03        .03            .53        .50
  Chrysler                           .14        .08            .60        .32

ᵃ This result states that the typical competitor gained 0.35 percent of its share price as a result of another firm's
recall, after adjusting for nonrecall news.

ᵇ This result states that GM's share price fell an average of 1.19 percent in response to a recall by either Ford or
Chrysler, after adjusting for nonrecall news.

ᶜ This result states that when GM recalled an automobile, the price per share of Chrysler and Ford stock increased
an average of 0.10 percent, after adjusting for nonrecall news.


recall news,” only one result remains significant at even the 10 percent 
level. 3 That result is the negative effect on Ford’s share price from a 
Ford recall. Indeed, no other result is significant at the 20 percent 
level. 

For instance, in table 1, Jarrell and Peltzman found that the price of 
a share of Big Three stock fell on average by 2.48 percent as a result 
of a recall. When we eliminated the overlaps, we found the decline to 
be only 0.92 percent (0.31 percent when adjusted for other nonrecall
news; see table 3). In table 2, Jarrell and Peltzman found that the 
share price of GM stock, unadjusted for other news, fell on average by 
2.92 percent in response to a recall by either Chrysler or Ford. 
Eliminating the overlaps, we found the unadjusted decline to be 1.26 
percent, which is not significant at the 10 percent level (table 4). 

The somewhat counterintuitive finding of Jarrell and Peltzman that 
GM shareholders suffered significantly greater wealth losses from 
Ford and Chrysler recalls than they did from recalls of their own firm 
is eliminated under the revised analysis. Specifically, neither the effect 

3 The 1975-81 period can be characterized as exceptionally tumultuous for the
domestic automobile industry. The tripling of oil prices, the loss of domestic market
share, and the question of Chrysler’s survival were just three nonrecall factors that
characterized this period. We call such factors “other nonrecall news.” Jarrell and
Peltzman make a compelling case for adjusting the CERs for nonrecall news (p. 528),
noting both the industry and firm-specific turmoil of this period. While they made this 
adjustment for the own-firm impact, they omitted this adjustment for the competitor- 
firm impact. 
of a GM recall on the GM share price nor the effect of a Ford or
Chrysler recall on the GM share price is significantly different from
zero using the clean-window sample, with or without adjustment for
other news. 

Finally, a telling statistic, illustrative of the lack of support for the 
Jarrell-Peltzman hypothesis, is the simple fraction of CERs in the 
clean-window sample that are less than zero. This fraction is less than 
0.45. Not only is this result not significantly different from 0.5 at the 
10 percent level, it is on the wrong side of 0.5 in terms of Jarrell and 
Peltzman’s claims. 4 
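
The comparison of this fraction with 0.5 amounts to a sign test. The text reports neither the number of CERs involved nor the exact test used, so the sketch below is only an illustration: it applies an exact two-sided binomial test, and the counts (13 negative CERs out of 29) are hypothetical values chosen to give a fraction just under 0.45.

```python
from math import comb


def two_sided_sign_test(negatives, n):
    """Exact two-sided binomial p-value for H0: P(CER < 0) = 0.5."""
    k = min(negatives, n - negatives)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, not taken from the paper.
n_events, n_negative = 29, 13
fraction = n_negative / n_events
p_value = two_sided_sign_test(n_negative, n_events)
print(f"fraction negative = {fraction:.3f}, two-sided p = {p_value:.3f}")
```

With counts of this size the p-value is far above 0.10, which is the sense in which a fraction slightly below one-half cannot be distinguished from 0.5.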


Concluding Remarks 

Jarrell and Peltzman’s paper on the impact of product recalls on a 
firm’s shareholders represents the opening of an important new area 
to event study research. In this paper, we show that several revisions 
of their approach to the automobile industry, revisions intended to 
strengthen their approach in terms of agreement with conventional 
event study methodology, together with some corrections to their 
data set, have a substantive impact on the results achieved. Following 
the revisions, little significant evidence remains indicating that securi¬ 
ties markets penalize shareholders for an automotive recall by driving 
down share prices. For the most part, neither shareholders of the firm 
recalling the automobile nor shareholders of competitor firms are 
significantly affected.


References 

Crafton, S. M.; Hoffer, George E.; and Reilly, Robert J. “Testing the Impact
of Recalls on the Demand for Automobiles.” Econ. Inquiry 19 (October 
1981): 694-703. 

Eades, Kenneth M.; Hess, Patrick J.; and Kim, E. Han. “On Interpreting 
Security Returns during the Ex-Dividend Period.” J. Financial Econ. 13
(March 1984): 3-34. 

Grinblatt, Mark S.; Masulis, Ronald W.; and Titman, Sheridan. “The Valua¬ 
tion Effects of Stock Splits and Stock Dividends.” J. Financial Econ. 13 
(December 1984): 461-90. 

4 In our earlier studies we found that recalls had a short-term impact on demand (in 
the month of the recall only), and then only when we limited our sample to the most 
severe type of recall. When we included lower severity recalls in our data set, we found 
no significant impact on auto demand. The clean-window sample of the current study
includes all three types of recalls, has size criteria different from those of our demand 
studies, and covers a data period different from that covered in our earlier work. There 
is, as well, no compelling reason to believe that a transitory impact on demand need be 
important enough to produce significant movement in share prices. We therefore 
consider the current share price results to be consistent with our earlier findings on 
demand. 



JOURNAL OF POLITICAL ECONOMY 



Hoffer, George E.; Pruitt, Stephen W.; and Reilly, Robert J. “Automotive 
Recalls and Informational Efficiency." Financial Rev. 22 (November 1987): 
433-42. 

Jarrell, Gregg, and Peltzman, Sam. “The Impact of Product Recalls on the 
Wealth of Sellers.” J.P.E. 93 (June 1985): 512-36. 

Reilly, Robert J., and Hoffer, George E. “Will Retarding the Information 
Flow on Automobile Recalls Affect Consumer Demand?” Econ. Inquiry 21 
(July 1983): 444-47. 



Book Review 


The Economics of Comparable Worth. By Mark Aldrich and Robert Buchele. 
Cambridge, Mass.: Ballinger, 1986. Pp. xxiii+180. $29.95. 

“Comparable worth” is the proposition that there should be “equal pay for 
jobs of ‘equal value,’” where a job’s “value” is determined by a job evaluation 
that assigns “evaluation points” to skill, effort, responsibility, and working 
conditions on the job. Comparable worth usually entails two corollaries: first, 
sex discrimination may be said to exist when pay for jobs of equal value is 
negatively associated with the proportion of the jobs' incumbents who are 
female; and, second, when such sex discrimination is found to exist, it should 
be remedied by requiring increases in pay for the predominantly female but 
“comparable” (and “undervalued”) jobs. This book represents the first serious 
attempt to provide an economic rationale for comparable worth. In the end, 
the attempt does not succeed, but Aldrich and Buchele are worth reading all 
the same. Although they are forthright in expressing their views, their book is 
admirably undogmatic. It provides a clearly written and engaging introduc¬ 
tion to comparable worth that is considerably more intelligent than others 
(e.g., Aaron and Lougy 1986) that have appeared recently, and it provides 
new evidence on the likely economic effects of adopting comparable worth. 

After reviewing the antecedents of comparable worth and trends in the 
relative pay of women workers in chapter 1, Aldrich and Buchele then turn, 
in chapter 2, to recent developments on the comparable worth front (e.g., its 
growing prominence in the public sector) and describe how job evaluations 
are conducted. Chapter 3 reviews theories of discrimination and occupational 
segregation and the extent to which such segregation accounts for the female- 
male pay gap. Chapter 4 discusses alternative comparable worth concepts, 
including not only the conventional view adopted by most advocates but also 
the authors' own somewhat different formulation, and it presents evidence 
on how implementation of these alternative concepts would affect the pay 
gap. Chapters 5 and 6 consider the impact of these alternative forms of 
comparable worth on the distribution of earnings and employment. Chapter 
7 summarizes the authors’ main findings and conclusions. 

In many discussions, comparable worth is a negation of fundamental economic 
precepts such as market wage determination. To opponents, including 
many economists, that is its basic flaw; to proponents, many of whom seem to 
know very little economics, that may be one of its major attractions. To Aldrich 
and Buchele, however, comparable worth is a straightforward application 
of the economic theory of compensating wage differentials (pp. 104-12). 
The factors usually cited in discussions of comparable worth—skill, effort, 
responsibility, and working conditions—correspond quite closely to the 
“principal circumstances” Adam Smith listed in his famous discussion in the 
Wealth of Nations (bk. 1, chap. 10) “which ... make up for a small pecuniary 
gain in some employments, and counter-balance a great one in others”: 
“agreeableness or disagreeableness” of the work, the “difficulty and expence” 
of training required for it, the “trust which must be reposed” in those doing 
the work, and so on. 

In modern-day terms, Aldrich and Buchele argue, Smith’s analysis merely 
means that, in the absence of discrimination, the equilibrium marginal return 
to each job trait would be equal (and purely compensating) among different 
jobs. Hence, if the return to a job trait is in fact systematically related to sex— 
if, say, the return to education is higher in predominantly male than in 
predominantly female jobs—one may properly draw an inference of sex 
discrimination. 

The fundamental problem with this argument is in its basic premise: as the 
authors acknowledge (see esp. n. 1, p. 101, and n. 1, p. 129), even in the 
absence of discrimination, the return to a job trait will be equal in all jobs only 
if it is independent of the amount of the trait in a given job. Yet there is no 
a priori reason, and certainly no strong empirical evidence, to warrant such a 
constant-returns assumption. 1 (Indeed, the authors’ own empirical analyses— 
to which I will come presently—suggest, at least on their face, that returns to 
job traits such as schooling are actually higher in predominantly female jobs 
than they are in predominantly male jobs!) 

Matters become still more complicated once one recognizes that workers 
have heterogeneous tastes and are concerned with many job traits rather than 
just one. To illustrate, imagine a set of jobs for which employers’ zero-profit 
curves (giving the relation between pay and education at zero profits) are 
concave. Even if the envelope of these curves is linear, must the marginal 
return to education be the same in all jobs, as Aldrich and Buchele appear to 
be arguing? Only if workers are indifferent to all other characteristics of these 
jobs, for that is the only case in which they would care about being on the 
envelope. Otherwise, if, say, some workers perceive job A as more desirable 
than job B even at lower wages, whereas other workers feel just the opposite, 
at least some individuals will choose points underneath the envelope. In that 
case, the observed marginal return to education (among male jobs, female 
jobs, or all jobs), or to any other job trait, will depend on the distribution of 
tastes for the different jobs ("supply”) as well as on the shape of employer 
zero-profit curves (“demand”). Once one recognizes the possibility of sex- 
related differences in job preferences, it seems clear that positive or negative 
correlations between the proportion female among jobs and either pay (as in 
analyses conventionally adopted by comparable worth advocates) or marginal 
returns to job traits (as in the Aldrich and Buchele approach) do not necessar¬ 
ily provide any useful information about discrimination, especially employer 
discrimination. 

Finally, Aldrich and Buchele argue that, when returns to job traits are not 
equalized and are correlated with sex, “a wage adjustment that equalizes 
returns would improve—not undermine—the workings of the market (wages 
would then better approximate the wage structure generated by a competitive, 
nondiscriminatory market)” (p. 111). But even if returns (i) would be 
equalized in a competitive, nondiscriminatory market and (ii) are now unequal, 
it is not obvious that simply decreeing first-best pricing without regard 
to alternatives or consequences (e.g., for employment), as Aldrich and 
Buchele apparently advocate, is necessarily appropriate in a second-best 
world. 

1 As an example, due to Ronald G. Ehrenberg, consider the jobs of packer and 
computer programmer. Even in the absence of discrimination, there is no obvious 
reason why the marginal return to either physical strength or mathematical skill should 
be the same in these two jobs. 

Whatever one thinks of the conceptual case for comparable worth—as 
formulated by most proponents or as reformulated by Aldrich and 
Buchele—the likely magnitude and effects of comparable worth-style wage 
adjustments are of considerable interest. On this score, Aldrich and Buchele 
not only review prior empirical studies but present an extensive set of original 
analyses, based primarily on National Longitudinal Survey data for young 
men (aged 28-38) and women (aged 26-36) as of 1980. They contend that, 
according to their results, sex differences in occupational distributions account 
for rather little of the overall pay gap. Since comparable worth is, in the 
end, a means of narrowing that portion of the pay gap that is associated with 
occupational differences (e.g., raising the pay of nurses to that of tree trim¬ 
mers or zookeepers), such results naturally imply that comparable worth 
would have a rather small impact on the overall pay gap regardless of which 
version of comparable worth—the authors' or that of conventional advo¬ 
cates—is implemented. Calculations of “impact" effects (i.e., ones that ignore 
induced changes in employment levels) suggest that comparable worth wage 
adjustments would also equalize earnings distributions, not only for all work¬ 
ers but also among women, and would entail slightly greater gains for black 
women than for white women. 

Finally, effects on employment would be small: the authors estimate that 
implementation of their version of comparable worth might entail an increase 
in pay of about 14 percent for women (and an increase of about 11.5 percent 
in the pay of women relative to that of men) and that this would reduce female 
employment by about 3.5 percent. As the authors concede, however, this has 
to be taken with more than one grain of salt, for all the estimated own-wage 
effects in the labor demand system underlying these results are positive. 

At least in general terms, these results are consistent with those of several 
prior researchers (e.g., Johnson and Solon [1986], who suggest that compara¬ 
ble worth is unlikely to have dramatic effects on the pay gap, and Ehrenberg 
and Smith [1987], who find that its effects on employment are likely to be 
rather small). However, some caveats are worth noting. The authors' pay 
regressions specify pay in terms of levels rather than logs; omit a “total expe¬ 
rience" variable in favor of “prior experience” and "years with current em¬ 
ployer” variables, without quadratic or interaction terms; and focus exclu¬ 
sively on a relatively young segment of the work force. Perhaps the most 
problematic aspect of the analyses is the use of occupational means for all 
regressors. Thus, rather than use individuals as the unit of analysis, Aldrich 
and Buchele estimate grouped-data regressions, that is, use data on individ¬ 
uals grouped by occupation. Since the variable used to define the groups 
(occupation) is quite unlikely to be exogenous to the dependent variable 
(pay), conventional grouped-data methods do not seem very plausible here. 
(The use of grouped rather than individual data may help explain why, 
according to Aldrich and Buchele's estimates, the sex difference in occupational 
distributions accounts for only about 15 percent of the male-female pay 
gap.) 

However, there is nothing particularly novel about this problem, for it 
arises in one form or another in virtually all work, whether it uses grouped 
or individual data, that is concerned with the relation between jobs (or job 
traits) and earnings. Most conventional theoretical models of earnings have 
very little to say about the relation between jobs (or job traits), wages, and 
optimizing behavior; in particular, they offer the empirical analyst little guidance 
on how to address, for example, self-selection into jobs or the notion that 
job traits are a choice variable. The consistency of least-squares estimates of 
coefficients on, say, “job traits” variables in studies of compensating wage 
differentials is doubtful; alternative techniques (e.g., instrumental variables 
or selection bias correction) are of course available, but in the relatively rare 
instances in which they are used, they are usually applied in a somewhat 
mechanical fashion, with little if any basis in a coherent theoretical model. 
That Aldrich and Buchele do not provide one is unfortunate; but they are 
hardly the first researchers to fail to do so, and they are not likely to be the 
last. 


Rutgers University 


Mark R. Killingsworth 


A Theory of Rational Addiction 


Gary S. Becker and Kevin M. Murphy 

University of Chicago 


We develop a theory of rational addiction in which rationality means 
a consistent plan to maximize utility over time. Strong addiction to a 
good requires a big effect of past consumption of the good on cur¬ 
rent consumption. Such powerful complementarities cause some 
steady states to be unstable. They are an important part of our 
analysis because even small deviations from the consumption at an 
unstable steady state can lead to large cumulative rises over time in 
addictive consumption or to rapid falls in consumption to abstention. 
Our theory also implies that “cold turkey” is used to end strong 
addictions, that addicts often go on binges, that addicts respond 
more to permanent than to temporary changes in prices of addictive 
goods, and that anxiety and tensions can precipitate an addiction. 


Use doth breed a habit. [William Shakespeare, The Two 
Gentlemen of Verona ] 


I. Introduction 

Rational consumers maximize utility from stable preferences as they 
try to anticipate the future consequences of their choices. Addictions 
would seem to be the antithesis of rational behavior. Does an alcoholic 
or heroin user maximize or weigh the future? Surely his preferences 
shift rapidly over time as his mood changes? Yet, as the title of our 
paper indicates, we claim that addictions, even strong ones, are usu¬ 
ally rational in the sense of involving forward-looking maximization 
with stable preferences. Our claim is even stronger: a rational frame¬ 
work permits new insights into addictive behavior. 

People get addicted not only to alcohol, cocaine, and cigarettes but 
also to work, eating, music, television, their standard of living, other 
people, religion, and many other activities. Therefore, much behavior 
would be excluded from the rational choice framework if addictions 
have to be explained in another way. Fortunately, a separate theory is 
not necessary since rational choice theory can explain a wide variety 
of addictive behavior. 

Sections II and III develop our model of rational addiction. They 
set out first-order conditions for utility maximization and consider 
dynamic aspects of addictive consumption. They derive conditions 
that determine whether steady-state consumption levels are unstable 
or stable. Unstable steady states are crucial to the understanding of 
rational addiction. 

Sections IV and V consider in detail the variables highlighted by the 
previous sections that determine whether a person becomes addicted 
to a particular good. These sections also derive the effects on the 
long-run demand for addictive goods of permanent changes in in¬ 
come and in the current and future cost of addictive goods. 

Section VI shows that consumption of addictive goods responds less 
to temporary changes in prices than to permanent changes. In addi¬ 
tion, the effects on future consumption of changes in current prices 
become weaker over time when steady-state consumption is stable, 
but they get stronger when the steady state is unstable. This section 
also shows how divorce, unemployment, and similar tension-raising 
events affect the demand for addictive goods. 

Section VII indicates why strong rational addictions must terminate 
abruptly, that is, must require going “cold turkey.” Rational binges 
are also considered. 

Our analysis builds on the model of rational addiction introduced 
by Stigler and Becker (1977) and developed much further by Iannaccone 
(1984, 1986). He also relates the analysis of addiction to the 
literature on habit persistence, especially to the work by Pollak (1970, 
1976), Ryder and Heal (1973), Boyer (1978, 1983), and Spinnewyn 
(1981). We appear to be the first to stress the importance for addic¬ 
tions of unstable steady-state consumption levels, to derive explicit 
long- and short-run demand functions for addictive goods, to show 
why addictions lead to abrupt withdrawals and binges, and to relate 
even temporary stressful events to permanent addictions. 

II. The Model 

Utility of an individual at any moment depends on the consumption 
of two goods, c and y. These goods are distinguished by assuming that 
current utility also depends on a measure of past consumption of c 
but not of y, as in 

$$u(t) = u[y(t), c(t), S(t)]. \tag{1}$$

For most of the discussion we assume that u is a strongly concave 
function of y, c, and S. Past consumption of c affects current utility 
through a process of “learning by doing,” as summarized by the stock 
of “consumption capital” (S). Although more general formulations 
can be readily handled, a simple investment function is adopted for 
the present: 

$$\dot{S}(t) = c(t) - \delta S(t) - h[D(t)], \tag{2}$$

where $\dot{S}$ is the rate of change over time in S, c is gross investment in 
“learning,” the instantaneous depreciation rate $\delta$ measures the exogenous 
rate of disappearance of the physical and mental effects of past 
consumption of c, and D(t) represents expenditures on endogenous 
depreciation or appreciation. 
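
To make equation (2) concrete, the following minimal sketch integrates the stock of consumption capital with a simple Euler step. It assumes constant consumption and no endogenous depreciation (h = 0); the function name `simulate_stock`, the step size, and every numerical value are illustrative assumptions and do not come from the paper.

```python
def simulate_stock(c, delta, s0, horizon, dt=0.01):
    """Euler integration of dS/dt = c - delta * S (eq. 2 with h = 0).

    c       : constant consumption rate (assumed constant for illustration)
    delta   : exogenous depreciation rate of the stock
    s0      : initial stock of consumption capital
    horizon : length of the simulated time span
    """
    steps = int(round(horizon / dt))
    s = s0
    path = [s]
    for _ in range(steps):
        s += dt * (c - delta * s)  # gross investment minus depreciation
        path.append(s)
    return path


# Hypothetical numbers: the stock rises from zero toward c / delta.
path = simulate_stock(c=2.0, delta=0.5, s0=0.0, horizon=40.0)
print(round(path[-1], 4), "vs. c / delta =", 2.0 / 0.5)
```

Under these assumptions the stock settles where gross investment just offsets depreciation, that is, where c equals $\delta S$; this is the steady-state line that figures in the dynamic analysis.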

With a length of life equal to T and a constant rate of time preference, 
$\sigma$, the utility function would be 

$$U(0) = \int_0^T e^{-\sigma t}\, u[y(t), c(t), S(t)]\,dt. \tag{3}$$

Utility is separable over time in y, c, and S but not in y and c alone 
because their marginal utilities depend on past values of c, as measured 
by S. 

A rational person maximizes utility subject to a constraint on his 
expenditures. If $A_0$ is the initial value of assets, if the rate of interest 
(r) is constant over time, if earnings at time t are a concave function of 
the stock of consumption capital at t, w(S), and if capital markets are 
perfect, then the budget equation would be 

$$\int_0^T e^{-rt}\,[y(t) + p_c(t)c(t) + p_d(t)D(t)]\,dt \le A_0 + \int_0^T e^{-rt}\,w(S(t))\,dt, \tag{4}$$

where the numeraire (y) has a constant price over time. A person 
maximizes his utility in equation (3) subject to this budget constraint 
and to the investment equation (2). The value (in utility terms) of the 
optimal solution, $V(A_0, S_0, w, p)$, gives the maximum obtainable utility 
from initial assets $A_0$, initial stock of capital $S_0$, the earnings function 
w(S), and a price structure p(t). Since $u(\cdot)$ and w(S) are concave functions, 
$V(A_0, S_0, w, p)$ is concave in $A_0$ and $S_0$. If $\mu = \partial V/\partial A_0$, then by 
concavity $d\mu/dA_0 \le 0$. 

The optimal paths of y(t) and c(t) are determined by the first-order 
conditions. If we let 

$$a(t) = \int_t^T e^{-(\sigma+\delta)(\tau-t)}\,u_s\,d\tau + \mu\int_t^T e^{-(r+\delta)(\tau-t)}\,w_s\,d\tau, \tag{5}$$

then the first-order conditions are 

$$\begin{aligned}
u_y(t) &= \mu e^{(\sigma-r)t},\\
h_d\,a(t) &= \mu p_d(t)\,e^{(\sigma-r)t},\\
u_c(t) &= \mu p_c(t)\,e^{(\sigma-r)t} - a(t) = \Pi_c(t).
\end{aligned} \tag{6}$$

The expression a(t) represents the discounted utility and monetary 
cost or benefit of additional consumption of c through the effect on 
future stocks. It measures the shadow price of an additional unit of 
stock. A rational person recognizes that consumption of a harmful 
good ($u_s, w_s < 0$) has adverse effects on future utility and earnings, 
while consumption of a beneficial good ($u_s, w_s > 0$) has positive effects 
on future utility and earnings. The shadow, or full, price of c(t), $\Pi_c(t)$, 
equals the sum of its market price and the money value of the future 
cost or benefit of consumption (see also Stigler and Becker 1977, eq. 
[8]). The stock component of the full price is itself endogenously 
determined by the optimal path, and yet it can also be said to help 
determine the optimal path by affecting the cost of c. 

Clearly, if future consumption is held fixed, the absolute value of 
a(t) is smaller when the depreciation rate on past consumption ($\delta$) and 
the rate of preference for the present ($\sigma$) are greater. This suggests 
that consumption of a harmful c is larger, and consumption of a 
beneficial c is smaller, when $\delta$ and $\sigma$ are greater. We will see that $\delta$ and 
$\sigma$ are also important in determining whether c is addictive. 
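
The comparative statics just described can be checked numerically. The sketch below evaluates the shadow value a(t) of equation (5) in closed form under the simplifying assumption, made here only for illustration, that the marginal effects $u_s$ and $w_s$ are constant along the path. The function name `shadow_value` and all parameter values are hypothetical.

```python
import math


def shadow_value(u_s, w_s, mu, sigma, r, delta, remaining):
    """a(t) from eq. (5) with constant u_s and w_s (an illustrative assumption).

    remaining is T - t, the time left until the end of life.
    With constant integrands each integral reduces to an annuity factor
    (1 - exp(-rate * remaining)) / rate.
    """
    def annuity(rate):
        return (1.0 - math.exp(-rate * remaining)) / rate

    return u_s * annuity(sigma + delta) + mu * w_s * annuity(r + delta)


# A harmful good: past consumption lowers both utility and earnings.
base = shadow_value(-1.0, -0.5, 1.0, sigma=0.05, r=0.05, delta=0.1, remaining=30.0)
faster_decay = shadow_value(-1.0, -0.5, 1.0, sigma=0.05, r=0.05, delta=0.5, remaining=30.0)
more_impatient = shadow_value(-1.0, -0.5, 1.0, sigma=0.5, r=0.05, delta=0.1, remaining=30.0)
print(base, faster_decay, more_impatient)
```

In this example the shadow value is negative in all three cases, and its absolute value shrinks when either the depreciation rate or the rate of time preference is raised, which is the direction stated in the text.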

It is clear from the second first-order condition that the optimal 
expenditure on endogenous depreciation (D) to reduce the stock of 
capital is larger, or the optimal expenditure on endogenous appreciation 
to increase the stock is smaller, when the marginal value of the 
stock, a(t), is smaller. This value falls as the stock increases since the 
value function is concave in S. Therefore, individuals will take steps to 
depreciate the stock more rapidly when it is larger. 

III. Dynamics 

The first-order conditions (6) determine the initial consumption level 
of c, $c_0$, as a function of the initial stock of consumption capital, $S_0$, 
prices p(t), and the marginal utility of wealth $\mu$. To simplify the discussion 
of dynamics, we first assume an infinite life ($T = \infty$), a rate of 
time preference equal to the rate of interest ($\sigma = r$), and no endogenous 
depreciation (D(t) = 0). Since $\mu$ remains constant over time, the 
relations between $c_0$ and $S_0$ for given $\mu$ and p also give the relation 
over time between c and S for these values of $\mu$ and p. 

To analyze the dynamic behavior of c and S near a steady state, we 
can either take linear approximations to the first-order conditions or 
assume quadratic utility and earnings functions that have linear first-order 
conditions. (Related dynamics were developed by Ryder and 
Heal [1973] and Boyer [1983].) If the utility function u is quadratic in 
c, y, and S, if earnings are quadratic in S, and if $p_c(t) = p_c$ for all t, then 
the value function is also quadratic. By optimizing y out with its first-order 
condition, we obtain a function that is quadratic only in c(t) and 
S(t): 

$$F(t) = \alpha_c c(t) + \alpha_s S(t) + \frac{\alpha_{cc}}{2}[c(t)]^2 + \frac{\alpha_{ss}}{2}[S(t)]^2 + \alpha_{cs}\,c(t)S(t) - \mu p_c c(t), \tag{7}$$

where the coefficients $\alpha_i$ and $\alpha_{ij}$ depend on the coefficients of both the 
utility and earnings functions. We know that $\alpha_{cc} < 0$ and $\alpha_{ss} < 0$ by 
concavity of the u and w functions. Then the optimization problem 
involves only c(t) and S(t): 

$$V(A_0, S_0, w, p) = k + \max_{c,\,S} \int_0^\infty e^{-\sigma t}\,F[S(t), c(t)]\,dt, \tag{8}$$

where k is a constant that depends on $A_0$, $\mu$, $\sigma$, and the coefficients for 
y in the quadratic utility function. The maximization occurs subject to 
equation (2) with h = 0 and to the transversality condition 

$$\lim_{t\to\infty} e^{-\sigma t}[S(t)]^2 = 0. \tag{9}$$


Equation (8) is a straightforward maximization problem in the calculus 
of variations, where F is a function only of S and $\dot{S}$ through the 
linear relation between c, S, and $\dot{S}$ in equation (2). The Euler equation 
can be expressed as 

$$\ddot{S} - \sigma\dot{S} - BS = \frac{(\sigma+\delta)\alpha_c + \alpha_s - (\sigma+\delta)\mu p_c}{\alpha_{cc}}, \tag{10}$$

with 

$$B = \delta(\sigma+\delta) + \frac{\alpha_{ss}}{\alpha_{cc}} + (\sigma+2\delta)\frac{\alpha_{cs}}{\alpha_{cc}}. \tag{11}$$

This is a second-order linear differential equation in S(t), with two 
roots given by 

$$\lambda = \frac{\sigma \pm \sqrt{\sigma^2 + 4B}}{2}. \tag{12}$$

The term under the radical is positive because essentially it is a quadratic 
form in $\sigma + 2\delta$ and 2: 

$$\sigma^2 + 4B = \frac{1}{\alpha_{cc}}\left[(\sigma+2\delta)^2\alpha_{cc} + 4\alpha_{ss} + 4(\sigma+2\delta)\alpha_{cs}\right] > 0, \tag{13}$$

and the Hessian of the concave function F is negative definite. Hence 
both roots of (12) are real. Moreover, the larger root exceeds $\sigma/2$ and 
can be ignored with an infinite horizon; otherwise, $[c(t)]^2$ would eventually 
grow at a faster rate than $\sigma$, which would violate the transversality 
condition in equation (9). 

The optimal path of the capital stock is determined from the initial 
condition and the smaller root alone: 

$$S(t) = d\,e^{\lambda_1 t} + S^*, \quad\text{with}\quad \lambda_1 = \frac{\sigma - \sqrt{\sigma^2 + 4B}}{2}, \quad d = S_0 - S^*. \tag{14}$$

If the steady state, $S^*$, is stable, S grows over time to $S^*$ if $S_0 < S^*$ and 
declines over time to $S^*$ if $S_0 > S^*$. Equation (14) shows that $S^*$ is 
stable if and only if B > 0 because then $\lambda_1 < 0$. 
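
The stability condition can be illustrated with a short numerical sketch that computes B from equation (11) and the two roots from equation (12). The function name `euler_roots` and the coefficient values are assumptions chosen for illustration; they satisfy the concavity requirement that the product of the two own second-order coefficients exceeds the square of the cross coefficient, but they are not estimates from the paper.

```python
import math


def euler_roots(sigma, delta, a_cc, a_ss, a_cs):
    """Return (B, smaller root, larger root) from eqs. (11) and (12)."""
    B = delta * (sigma + delta) + a_ss / a_cc + (sigma + 2.0 * delta) * a_cs / a_cc
    disc = sigma ** 2 + 4.0 * B
    if disc < 0:
        # Cannot happen when F is strictly concave, see eq. (13).
        raise ValueError("coefficients violate concavity of F")
    root = math.sqrt(disc)
    return B, (sigma - root) / 2.0, (sigma + root) / 2.0


# Hypothetical coefficients: a_cc = -1, a_ss = -0.11, so a_cs must stay below about 0.33.
for a_cs in (0.1, 0.3):
    B, lam1, lam2 = euler_roots(sigma=1.0, delta=0.1, a_cc=-1.0, a_ss=-0.11, a_cs=a_cs)
    verdict = "stable" if B > 0 else "unstable"
    print(f"a_cs={a_cs}: B={B:.3f}, lambda_1={lam1:.4f}, lambda_2={lam2:.4f} -> {verdict}")
```

With the weaker complementarity the smaller root is negative and the steady state is stable; with the stronger complementarity B turns negative, both roots are positive, and the steady state is unstable. In both cases the larger root exceeds $\sigma/2$.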

Equation (14) also implies that 

$$c(t) = (\delta + \lambda_1)S(t) - \lambda_1 S^*. \tag{15}$$

The slope between c and S increases as $\lambda_1$ increases, and it reaches a 
maximum value when $\lambda_1 = \sigma/2$, that is, when $\sigma^2 + 4B = 0$. Given the 
definition of $\lambda_1$ in equation (14) and of B in equation (11), equation 
(15) implies that c and S are positively related ($\lambda_1 > -\delta$), negatively 
related ($\lambda_1 < -\delta$), or unrelated ($\lambda_1 = -\delta$) as 

$$(\sigma + 2\delta)\alpha_{cs} \gtreqless -\alpha_{ss} > 0. \tag{16}$$

Since “unrelated” means that past consumption of c has no effect 
on its present consumption, behavior would then be the same as when 
preferences are additively separable over time in c and y, even though 
the utility function is nonseparable in S and c. Whether behavior is 
effectively separable over time depends not only on the current- 
period utility and earnings functions but also on time preference and 
the rate of depreciation of past consumption. 
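
As a check on the equivalence between condition (16) and the sign of the slope in equation (15), the sketch below classifies a parameter set both ways. The helper names `relation_of_c_and_s` and `slope_of_c_on_s` and the numerical coefficients are hypothetical and serve only to illustrate the algebra.

```python
import math


def relation_of_c_and_s(sigma, delta, a_ss, a_cs):
    """Classify the relation between c and S using inequality (16)."""
    lhs = (sigma + 2.0 * delta) * a_cs
    rhs = -a_ss
    if math.isclose(lhs, rhs):
        return "unrelated"
    return "positively related" if lhs > rhs else "negatively related"


def slope_of_c_on_s(sigma, delta, a_cc, a_ss, a_cs):
    """Slope delta + lambda_1 of the linear relation in eq. (15)."""
    B = delta * (sigma + delta) + a_ss / a_cc + (sigma + 2.0 * delta) * a_cs / a_cc
    lam1 = (sigma - math.sqrt(sigma ** 2 + 4.0 * B)) / 2.0
    return delta + lam1


# Illustrative coefficients only (a_cc = -1, a_ss = -0.11, sigma = 1, delta = 0.1).
for a_cs in (0.05, 0.1):
    label = relation_of_c_and_s(1.0, 0.1, -0.11, a_cs)
    slope = slope_of_c_on_s(1.0, 0.1, -1.0, -0.11, a_cs)
    print(f"a_cs={a_cs}: {label}, slope={slope:.4f}")
```

In these examples the slope is positive exactly when the left-hand side of (16) exceeds the right-hand side, as the text asserts.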

The line $s^1s^1$ in figure 1 has a stable steady state at $\delta S^{*1} = c^{*1}$, 
whereas the line $s^0s^0$ has an unstable steady state at $\delta S^{*0} = c^{*0}$. The 
arrows indicate that deviations from $S^{*1}$ cause a return to $S^{*1}$ along 
the linear path $s^1s^1$. Deviations from $S^{*0}$ cause further deviations in 
the same direction along the linear path $s^0s^0$. 


IV. Adjacent Complementarity and Addiction 

If the marginal utility of c in the F function is greater when the stock 
of consumption capital (S) is greater ($\alpha_{cs} > 0$), the marginal utility of c 
would rise over time if S rose over time. Consumption of c, however, 
might still fall over time because the full price of c ($\Pi_c$ in eq. [6]) also 
rises over time since $\alpha_{ss} < 0$. The rise in full price would be larger 
when the function F is more concave in S ($\alpha_{ss}$ is larger in absolute 
value for a given value of $\alpha_{cs}$), when the future is less heavily discounted 
($\sigma$ is smaller), and when depreciation of past consumption ($\delta$) 
is less rapid. The increase over time in the marginal utility of c would 
exceed the increase in full price if and only if the left-hand side of 
equation (16) exceeds the right-hand side. There is said to be “adjacent 
complementarity” when this inequality holds (the concepts of 
adjacent and distant complementarity were introduced by Ryder and 
Heal [1973]). 

The basic definition of addiction at the foundation of our analysis is 
that a person is potentially addicted to c if an increase in his current 
consumption of c increases his future consumption of c. This occurs if 
and only if his behavior displays adjacent complementarity. This 
definition has the plausible implication that someone is addicted to a 
good only when past consumption of the good raises the marginal 
utility of present consumption ($\alpha_{cs} > 0$). However, such an effect on 
the marginal utility is necessary but is by no means sufficient even for 
potential addiction since potential addiction also depends on the 
other variables in equation (16). 

The relation between addiction and adjacent complementarity was 
first recognized by Boyer (1983) and Iannaccone (1986). Boyer considers 
discrete time and the special case in which (in our notation) 
$S_t = c_{t-1}$. The distinction between adjacent complementarity and the 
effect of S on the marginal utility of c is not interesting analytically 
in that case because the sign of $\alpha_{cs}$ is then the sole determinant of 
whether past and present consumption are complements or substitutes. 

Experimental and other studies of harmful addictions have usually 
found reinforcement and tolerance (Donegan et al. 1983). Reinforcement 
means that greater current consumption of a good raises its 
future consumption. Reinforcement is closely related to the concept 
of adjacent complementarity. Tolerance means that given levels of 
consumption are less satisfying when past consumption has been 
greater. Rational harmful addictions (but not beneficial addictions) do 
imply a form of tolerance because higher past consumption of harm¬ 
ful goods lowers the present utility from the same consumption level. 

According to our definition of addiction, a good may be addictive 
to some persons but not to others, and a person may be addicted to 
some goods but not to other goods. Addictions involve an interaction 
between persons and goods. For example, liquor, jogging, cigarettes, 
gambling, and religion are addictive to some people but not to others. 
The importance of the individual is clearest in the role of time pref¬ 
erence in determining whether there is adjacent complementarity. 
Our analysis implies the common view that present-oriented individ¬ 
uals are potentially more addicted to harmful goods than future- 
oriented individuals. The reason for this is that an increase in past 
consumption leads to a smaller rise in full price when the future is 
more heavily discounted. 
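
The role of time preference can be read directly from condition (16). Rearranging it for the case $\alpha_{cs} > 0$ gives a threshold rate of time preference; the symbol $\bar{\sigma}$ is introduced here only for illustration and does not appear in the original analysis.

$$(\sigma + 2\delta)\,\alpha_{cs} > -\alpha_{ss} \quad\Longleftrightarrow\quad \sigma > \bar{\sigma} \equiv \frac{-\alpha_{ss}}{\alpha_{cs}} - 2\delta, \qquad \alpha_{cs} > 0.$$

For example, with the purely hypothetical values $\alpha_{ss} = -0.11$, $\alpha_{cs} = 0.1$, and $\delta = 0.1$, the threshold is $\bar{\sigma} = 1.1 - 0.2 = 0.9$, so a person with $\sigma = 1$ displays adjacent complementarity whereas an otherwise identical person with $\sigma = 0.5$ does not.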

The rate of depreciation of past consumption ($\delta$), complementarity 
between present and past consumption ($\alpha_{cs}$), and the effect of changes 
in the stock of consumption capital on earnings depend on the individual 
as well as on the good. For example, drunkenness is much 
more harmful to productivity in some jobs than in other jobs. 

Whether a potentially addictive person does become addicted depends 
on his initial stock of consumption capital and the location of 
his demand curve. For example, the curves that relate c and S in 
figure 1 display adjacent complementarity, yet the person with these 
relations would ultimately abstain from consuming c if $S_0 < S^{*0}$ and 
$s^0s^0$ is relevant. We postpone until Section VI a discussion of the 
determination of the initial stock of consumption capital and the location 
of demand functions. 

The smaller root ($\lambda_1$) in (12) is larger in algebraic value when the 
degree of adjacent complementarity increases because of increases in 
$\sigma$, $\delta$, or $\alpha_{cs}$. This root along with the larger root would be positive if 
adjacent complementarity is sufficiently strong to make B < 0. The 
steady state is then unstable: consumption grows over time if initial 
consumption exceeds the steady-state level, and it falls to zero if initial 
consumption is below that level. 
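
The divergence from an unstable steady state can be traced with equations (14) and (15). The following sketch is only illustrative: the function name `linear_path`, the steady-state stock of 5, and the root of 0.168 are assumed values, and the path is simply cut off once the linear formula would imply negative consumption, which is how abstention is represented here.

```python
import math


def linear_path(s0, s_star, lam1, delta, times):
    """Stock and consumption along the linear path of eqs. (14) and (15).

    The path is truncated once consumption would turn negative, a
    simplification standing in for abstention (an assumption of this sketch).
    """
    rows = []
    for t in times:
        s = (s0 - s_star) * math.exp(lam1 * t) + s_star   # eq. (14)
        c = (delta + lam1) * s - lam1 * s_star            # eq. (15)
        if c < 0:
            break
        rows.append((t, s, c))
    return rows


# Unstable case (lam1 > 0): start slightly above and slightly below S* = 5.
above = linear_path(6.0, 5.0, 0.168, 0.1, range(0, 11))
below = linear_path(4.0, 5.0, 0.168, 0.1, range(0, 11))
for label, rows in (("above", above), ("below", below)):
    print(label, [(t, round(c, 3)) for t, _, c in rows])
```

Starting above the steady state, consumption rises in every period of the example; starting below it, consumption declines until the path reaches abstention. A person who begins exactly at the steady state stays there, with consumption equal to $\delta S^*$.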

Unstable steady states are not an analytical nuisance to be eliminated 
by appropriate assumptions, for they are crucial to the understanding 
of rational addictive behavior. The reason is that an increase 
in the degree of potential addiction (i.e., an increase in the degree of 
adjacent complementarity) raises the likelihood that the steady state is 
unstable. Moreover, there must be adjacent complementarity in the 
vicinity of an unstable steady state because the curve that relates c and 
S must cut the positively sloped steady-state line from below at unstable 
points; see point ($c^{*0}$, $S^{*0}$) in figure 1. Unstable steady states are 
needed to explain rational “pathological” addictions, in which a person’s 
consumption of a good continues to increase over time even 
though he fully anticipates the future and his rate of time preference 
is no smaller than the rate of interest. However, they are also important 
in explaining “normal” addictions that may involve rapid increases 
in consumption only for a while. 

Unstable steady states also lead to another key feature of addic¬ 
tions: multiple steady states. Quadratic utility and earnings functions 
cannot explain multiple steady states because they imply the linear 
relation between c and S in equation (16). However, if a quadratic 
function were only a local approximation to the true function near a 
steady state and if the true function, say, had a cubic term in S³ with a
negative coefficient added to a quadratic function, the first-order conditions
in equation (6) would then generally imply two interior steady
states, one stable and one unstable. The negative coefficient for S³
means that the degree of adjacent complementarity declines as S increases
(see curve p¹p¹ in fig. 1) so that the level of c is smaller at the
unstable steady state (c*⁰, S*⁰) than at the stable steady state (c*¹, S*¹).
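A small numerical sketch may help fix ideas. It assumes, purely for illustration, a reduced-form relation c(S) = k0 + k1·S - k2·S² with k2 > 0 (the declining slope stands in for the negative cubic term in utility) together with the steady-state line c = δS. The coefficients are hypothetical and are not taken from the paper. A steady state is labeled unstable when the c(S) curve cuts the steady-state line from below, that is, when its slope exceeds δ.

```python
import math

def steady_states(k0, k1, k2, delta):
    """Interior steady states of c(S) = k0 + k1*S - k2*S**2 against c = delta*S.

    Returns a list of (S, c, label) sorted by S; requires k2 > 0.
    """
    # Solve -k2*S^2 + (k1 - delta)*S + k0 = 0 with the quadratic formula.
    a, b, c0 = -k2, k1 - delta, k0
    disc = b * b - 4.0 * a * c0
    if disc < 0:
        return []
    roots = sorted((-b + sign * math.sqrt(disc)) / (2.0 * a) for sign in (1, -1))
    result = []
    for s in roots:
        if s <= 0:
            continue  # keep interior (positive) steady states only
        slope = k1 - 2.0 * k2 * s  # slope of c(S) at the steady state
        label = "unstable" if slope > delta else "stable"
        result.append((s, delta * s, label))
    return result

states = steady_states(k0=-1.0, k1=1.5, k2=0.2, delta=0.5)
```

With these illustrative numbers the lower steady state is the unstable one and the higher steady state is stable, which is the ordering described in the text.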

With two steady states, relatively few persons consistently consume 
small quantities of addictive goods. Consumption diverges from the 
unstable state toward zero or toward the sizable steady-state level. 
Therefore, goods that are highly addictive to most people tend to 
have a bimodal distribution of consumption, with one mode located 
near abstention. Cigarette and cocaine consumption are good examples
of such bimodality. The distribution of alcohol consumption is
more continuous presumably because alcoholic beverages are not ad¬ 
dictive for many people. 

This paper relies on a weak concept of rationality that does not rule 
out strong discounts of future events. The consumers in our model 
become more and more myopic as time preference for the present (σ)
gets larger. The definition of a(t) in equation (5) shows that the present
value of the cost of an increase in the current consumption goes to
zero as σ goes to infinity (if the interest rate equals σ). It is then
“rational” to ignore the future effects of a change in current con¬ 
sumption. 

The definition of adjacent complementarity in equation (16) makes 
clear that time preference for the present is not necessary for addiction.
However, fully myopic consumers (σ = ∞) do have the potential
to become addicted whenever an increase in past consumption raises
the marginal utility of current consumption (α_cs > 0). Although fully
myopic behavior is formally consistent with our definition of rational
behavior, should someone who entirely or largely neglects future con¬ 
sequences of his actions be called rational? Some economists and phi¬ 
losophers even suggest that rationality excludes all time preference. 

Fortunately, we can reinterpret σ so that it may be positive even
when individuals have neutral time preferences. If lives are finite, the 
inverse of the number of years of life remaining is an approximation 
to the rate of "time preference" for people who do not discount the 
future. Then old people are rationally “myopic” because they have 
few years of life remaining. Other things the same, therefore, older 
persons are less concerned about the future consequences of current 
consumption, and hence they are more likely to become addicted. Of 
course, other things are not usually the same: older people are less 
healthy and subject to different life cycle events than younger people. 
Moreover, people who manage to become old are less likely to be 
strongly addicted to harmful goods. 

To simplify the discussion, most of the paper assumes that σ = r,
but the analysis also has novel implications about the consequences of
changes in σ relative to r. When utility functions are separable over
time, an increase in preference for the present compared with the 
interest rate raises current consumption and reduces future con¬ 
sumption. This intuitive conclusion may not apply with addictive 
goods because the full cost of an addictive good depends on the 
degree of time preference. Indeed, if the degree of addiction is sufficiently
strong, a higher σ is likely to raise the growth over time in
consumption of the addictive good (see the fuller discussion in Becker 
and Murphy [1986, sec. 8]). This steepening of the consumption 
profile over time as time preference increases is contrary to the intui¬ 
tion built up from prolonged consideration of separable utility func¬ 
tions, but it is not contrary to any significant empirical evidence. 

We follow Stigler and Becker (1977) in distinguishing harmful 
from beneficial addictions by whether consumption capital has nega¬ 
tive or positive effects on utility and earnings. Since the definitions of 
adjacent complementarity and addiction do not depend on first 
derivatives of the utility and earnings functions, they apply to both 
harmful and beneficial addictions. For example, increases in σ and δ
raise the degree of adjacent complementarity, and hence they raise 
the extent of potential addiction to both beneficial and harmful 
goods. 

The stock component of full price—the term a(t) in equation (5)— 
does depend on the signs of u_s and w_s: a future cost is added to the
current market price of harmful addictive goods, whereas a future 
benefit is subtracted from the current price of beneficial goods. 
Therefore, an increase in the rate of preference for the present and 
in the depreciation rate on consumption capital raises the demand for
harmful goods but lowers the demand for beneficial goods. As a 
result, drug addicts and alcoholics tend to be present-oriented, while 
religious individuals and joggers tend to be future-oriented. 


V. Permanent Changes in Price 

A permanent decline in the price of c, p_c, that is compensated to
maintain the marginal utility of wealth (μ) constant would raise c(t)
because the value function is concave. Moreover, 

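$$\frac{\partial}{\partial t}\left[\frac{\partial c(t)}{\partial p_c}\right] = \frac{\partial \dot{c}}{\partial p_c} = \frac{\partial}{\partial p_c}\left(\frac{\partial c}{\partial S}\,\dot{S}\right) = \frac{\partial c}{\partial S}\,\frac{\partial \dot{S}}{\partial p_c} + \dot{S}\,\frac{\partial}{\partial p_c}\left(\frac{\partial c}{\partial S}\right). \tag{17}$$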

The second term on the far right-hand side is zero in the vicinity of a
steady state because Ṡ equals zero at the steady state. The sign of the
first term is the opposite of the sign of ∂c/∂S because p_c has a negative
effect on c(t) and hence on Ṡ. By definition, the sign of ∂c/∂S is positive
with adjacent complementarity, zero with independence, and negative
with adjacent substitution.

Therefore, the effect of a compensated change in p_c on c grows over
time when present and past consumption are adjacent complements; 
that is, the effect grows over time for addictive goods. A permanent 
change in the price of an addictive good may have only a small initial 
effect on demand, but the effect grows over time until a new steady 
state is reached (assuming that consumption eventually approaches a 
stable state). 

Indeed, if the utility function is quadratic, the long-run effect on 
consumption of a permanent change in price tends to be larger for 
addictive goods. To show this, differentiate the first-order conditions 
in equation (6) with a quadratic utility function to get the change in
consumption between stable steady states: 


$$\frac{dc^*}{dp_c} = \frac{\mu\,\delta(\sigma + \delta)}{\alpha_{cc}B} < 0. \tag{18}$$


The denominator is negative near stable steady states because α_cc < 0
and B > 0 (see eq. [14]). Since greater addiction lowers B, greater 
addiction raises the long-run effect on consumption of a change in 
own price. 
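As a quick check on this comparative static, the sketch below evaluates the long-run slope, read here as dc*/dp_c = μδ(σ + δ)/(α_cc·B), for two hypothetical values of B. The function name and the numbers are illustrative assumptions; only the functional form is taken from the text.

```python
def long_run_price_slope(mu, delta, sigma, alpha_cc, B):
    """Long-run effect of own price on steady-state consumption.

    Assumes the quadratic case near a stable steady state, so that
    alpha_cc < 0 and B > 0.
    """
    if alpha_cc >= 0 or B <= 0:
        raise ValueError("expects alpha_cc < 0 and B > 0 (stable steady state)")
    return mu * delta * (sigma + delta) / (alpha_cc * B)

# Weaker addiction: larger B. Stronger addiction: smaller B.
weak = long_run_price_slope(mu=1.0, delta=0.5, sigma=0.05, alpha_cc=-1.0, B=0.4)
strong = long_run_price_slope(mu=1.0, delta=0.5, sigma=0.05, alpha_cc=-1.0, B=0.1)
```

Both slopes are negative, and the slope for the smaller B is larger in absolute value, which is the sense in which greater addiction raises the long-run response to price.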

Long-run elasticities would be proportional to the slopes in equa¬ 
tion (18) if initial steady-state consumption were independent of B. 
Changes in B need not affect the initial steady-state consumption 
because steady-state consumption is determined by first derivatives of 
the utility and wage functions that do not affect B. 

The full effect of a finite change in price on the aggregate consumption
of addictive goods could be much greater than the effect in
equation (18) because of unstable steady states. In figure 2, all households
with initial consumption capital between S*² and S*¹ would be
to the left of the unstable state when p_c = p¹ and the relevant curve is
p¹p¹, but they would be to the right of the unstable state when p_c = p²
and the relevant curve is p²p². Hence a reduction in price from p¹ to
p² greatly raises the long-run demand for c by these households.

Smoking and drinking are the only harmful addictions that have 
been extensively studied empirically. Mullahy (1985, chap. 2) reviews 
many estimates of the price elasticity of demand for cigarettes and shows that they are
mainly distributed between .4 and .5. Estimates that implement our 
model of addiction imply long-run price elasticities for cigarettes of 
about .6 (see Becker, Grossman, and Murphy 1987). This is not small 
compared to elasticities estimated for other goods. Price elasticities 
for alcoholic beverages appear to be higher, especially for liquor (see 
the studies reviewed in Cook and Tauchen [1982]). 

The aggregate demand for drinking and smoking could be quite 
responsive to price, and yet the most addicted might have modest 
responses. Fortunately, Cook and Tauchen consider the effect of the 
cost of liquor on heavy drinking as well as on the aggregate amount of
drinking. They measure heavy drinking by the death rate from cir¬ 
rhosis of the liver (heavy drinking is a major cause of death from this 
disease). They conclude that even small changes in state excise taxes 
on liquor have a large effect on death rates from this disease. This 
suggests either that heavy drinkers greatly reduce their consumption 
when liquor becomes more expensive or that the number of individ¬ 
uals who become heavy drinkers is sensitive to the price of alcohol. 

Heroin, cocaine, gambling, and other harmfully addictive goods 
are often illegal; beneficially addictive goods, such as particular religions
or types of music, are also sometimes banned. Banned goods
become more expensive when the ban is supported by punishments to 
consumers and producers. Our analysis implies that the long-run 
demand for illegal heroin and other illegal addictive drugs tends to be 
much reduced by severe punishments that greatly raise their cost. 
However, the demand for banned addictive goods may not respond 
much to a temporary rise in price due to a temporary burst of active 
law enforcement or during the first year after a permanent ban is 
imposed. 

The full price of addictive goods to rational consumers includes the 
money value of changes in future utility and earnings induced by 
changes in current consumption. The information that began to be¬ 
come available in the late 1950s on the relation between smoking and 
health provides an excellent experiment on whether persons addicted 
to smoking consider delayed harmful consequences or whether, in¬ 
stead, they are myopic. Ippolito, Murphy, and Sant (1979) estimate 
that 11 years after the first Surgeon General’s report on smoking in 
1964, per capita consumption of cigarettes and of tar and nicotine 
had been reduced by 34 percent and 45 percent, respectively. This 
evidence blatantly contradicts the view that the majority of smokers 
were myopic and would not respond to information about future 
consequences because they discounted the future heavily. 

Of course, persons who continued to smoke, and those who began 
to smoke after the new information became available, might be more 
myopic than quitters and persons who did not begin to smoke. One 
explanation for the much stronger negative relation between smoking 
and education in the 1970s and 1980s than prior to the Surgeon 
General's report is that more educated people tend to have lower 
rates of preference for the present. Presumably, this is partly why 
they accept the delayed benefits of higher education. Farrell and 
Fuchs (1982) do show that the negative association between education 
and smoking is not fully explained by any effects of education on the 
propensity to smoke. 

The behavior of teenagers is persuasive evidence of forward- 
looking behavior by smokers. Teenagers are often said to be among 
the most impatient (see the questionnaire evidence in Davids and 
Falkoff [1975]). If so, their propensity to smoke should be hardly 
affected by health consequences delayed for 20 or more years, al¬ 
though parental disapproval may have a big effect. Yet smoking rates 
of males between ages 21 and 24 declined by over one-third from
1964 to 1975 (see Harris 1980). 

The long-run change in the consumption of addictive goods due to 
a change in wealth also exceeds the short-run change because the 
stock of consumption capital would change over time until a new 
steady state is reached (Spinnewyn [1981, p. 101] has a similar result
for wealth effects). By differentiating the first-order conditions in
equation (6) with respect to μ, the marginal utility of wealth, we get
the response of steady-state consumption to a change in wealth (if the
utility function is quadratic):



Since μ and wealth are negatively related, c is a superior or inferior
good as dc*/dμ ≶ 0. If wealth rises because of an increase in earnings,
the term detjd\i is likely to be positive for harmfully addictive goods
because the negative effect on earnings of increased consumption is
likely to be greater when earnings are greater (d\uwjd\i. > 0). For
example, heavy drinking on the job reduces the productivity of an 
airline pilot or doctor more than that of a janitor or busboy. Equation 
(19) shows that c would be an inferior good if the negative effect on 
earnings were sufficiently large. Therefore, the spread of informa¬ 
tion about the health hazards of smoking should have reduced the 
income elasticity of smoking, and it could have made smoking an 
inferior good. This elasticity apparently did decline after the 1960s to 
a negligible level (see Schneider, Klein, and Murphy 1981). Since 
women earn less than men, this may help explain why smoking by 
women has grown relative to smoking by men during the past 25 
years. 


VI. Temporary Changes in Price and Life 
Cycle Events 


If utility and earnings functions are quadratic, the demand for c at 
each moment in time can be explicitly related to the initial S and to 
past and future prices of c (see eq. [A2] in App. A). Both past and 
future prices affect current consumption, but the effects are not sym¬ 
metrical. Changes in past prices affect current consumption by chang¬ 
ing the current stock of consumption capital, whereas changes in 
future prices affect current consumption by changing current full 
prices through the effects on future stocks and future consumption. 

The effect on consumption of a differential change in price over a 
small interval divided by the length of this interval has a nonzero limit
as the length of the interval goes to zero. Equation (A2) implies that 
these limits are 


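$$\frac{dc(t)}{dp(\tau)} = \frac{\mu(\delta + \lambda_1)e^{-\lambda_2\tau}}{\alpha_{cc}(\lambda_2 - \lambda_1)}\left[(\delta + \lambda_2)e^{\lambda_2 t} - (\delta + \lambda_1)e^{\lambda_1 t}\right] \quad \text{for } \tau > t, \tag{20}$$

$$\frac{dc(t)}{dp(\tau)} = \frac{\mu(\delta + \lambda_1)e^{\lambda_1 t}}{\alpha_{cc}(\lambda_2 - \lambda_1)}\left[(\delta + \lambda_2)e^{-\lambda_1\tau} - (\delta + \lambda_1)e^{-\lambda_2\tau}\right] \quad \text{for } t > \tau, \tag{21}$$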


where changes in price are compensated to hold the marginal utility 
of wealth constant, and λ_2 and λ_1 are the larger and smaller roots of
equation (12). 

The important implication of these equations is that the signs of 
both cross-price derivatives depend only on the sign of δ + λ_1. Section
III shows that this term is positive with adjacent complementarity and
negative with adjacent substitution. Since the terms in brackets are
always positive and α_cc is negative, dc(t)/dp(τ) will be negative if and
only if δ + λ_1 is positive. Hence, adjacent complementarity is a necessary
and sufficient condition for negative compensated cross-price
effects. 
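The sign argument can be checked numerically. The sketch below codes the two cross-price limits as we read them: for a future price (τ > t), dc(t)/dp(τ) = [μ(δ + λ_1)e^(-λ_2 τ) / (α_cc(λ_2 - λ_1))]·[(δ + λ_2)e^(λ_2 t) - (δ + λ_1)e^(λ_1 t)], and for a past price (t > τ), dc(t)/dp(τ) = [μ(δ + λ_1)e^(λ_1 t) / (α_cc(λ_2 - λ_1))]·[(δ + λ_2)e^(-λ_1 τ) - (δ + λ_1)e^(-λ_2 τ)]. The function name and the parameter values are hypothetical, and the roots are simply supplied as inputs rather than computed from the model.

```python
import math

def cross_price_effect(t, tau, lam1, lam2, delta, mu, alpha_cc):
    """Compensated effect on c(t) of a price change at date tau (tau != t).

    lam1 < lam2 are the two roots; alpha_cc < 0 in the concave case.
    """
    if t == tau:
        raise ValueError("the limits are defined only for tau != t")
    scale = mu * (delta + lam1) / (alpha_cc * (lam2 - lam1))
    if tau > t:
        # Future price change, as in equation (20).
        bracket = ((delta + lam2) * math.exp(lam2 * t)
                   - (delta + lam1) * math.exp(lam1 * t))
        return scale * math.exp(-lam2 * tau) * bracket
    # Past price change, as in equation (21).
    bracket = ((delta + lam2) * math.exp(-lam1 * tau)
               - (delta + lam1) * math.exp(-lam2 * tau))
    return scale * math.exp(lam1 * t) * bracket

roots = dict(lam1=-0.2, lam2=0.25, mu=1.0, alpha_cc=-1.0)
# Adjacent complementarity: delta + lam1 > 0.
future_compl = cross_price_effect(t=2.0, tau=5.0, delta=0.5, **roots)
past_compl = cross_price_effect(t=5.0, tau=2.0, delta=0.5, **roots)
# Adjacent substitution: delta + lam1 < 0.
future_subst = cross_price_effect(t=2.0, tau=5.0, delta=0.1, **roots)
# Same lead time (tau - t = 3) but anticipated for longer (larger t).
early = cross_price_effect(t=1.0, tau=4.0, delta=0.5, **roots)
late = cross_price_effect(t=4.0, tau=7.0, delta=0.5, **roots)
```

With δ + λ_1 > 0 both effects come out negative, with δ + λ_1 < 0 the future-price effect turns positive, and for a fixed lead time the effect is larger in absolute value the longer the change has been anticipated.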

A negative cross derivative when marginal utility of income is held 
constant is a common definition of complementarity in consumption 
theory. Therefore, adjacent complementarity is a necessary and suffi¬ 
cient condition for present and future consumption, and for present 
and past consumption, to be complements. This conclusion strongly 
qualifies the claim by Ryder and Heal that adjacent “complementarity 
... is different from complementarity in the Slutsky sense” (1973, p. 
4). Since our definition of potential addiction is linked to adjacent 
complementarity, a good is addictive if and only if consumption of the 
good at different moments in time are complements. Moreover, the 
degree of addiction is stronger when the complementarity in con¬ 
sumption is greater. 

The link between addiction and complementarity implies that an 
anticipated increase in future prices of addictive goods lowers current 
consumption. These negative effects of anticipated future price 
changes on the present consumption of addictive goods are a major 
way to distinguish rational addiction or rational habit formation from 
myopic behavior (myopic behavior is assumed, e.g., by Pollak [1970,
1976], von Weizsäcker [1971], and Phlips [1974]).

The longer that future price changes are anticipated, the bigger is 
their effect on the current consumption of addictive goods. In equation
(20), where τ > t, an increase in t (with τ − t held constant)
increases the absolute value of dc(t)/dp(τ) if λ_1 + δ > 0. The reason is
that the longer a future price rise of an addictive good is anticipated, 
the greater is the reduction in past consumption of the good. There¬ 
fore, the smaller would be the stock of capital carried into the present 
period. We are not merely restating the familiar result that elasticities 
of demand are greater when price changes are anticipated since the 
elasticity for goods with adjacent substitution (λ_1 + δ < 0) is smaller
when future price changes have been anticipated for a longer period
of time. 

Equation (21) shows that recent past prices have larger effects on 
current consumption than more distant past prices when steady states 
are stable. However, with an unstable steady state, changes in con¬ 
sumption at one point in time lead to larger and larger changes in 
future consumption since consumption capital continues to grow. 

The permanent changes in stationary price considered in Section V 
can be said to combine changes in price during the present period 
with equal changes in price during all future periods. Since a (com¬ 
pensated) future price increase of an addictive good reduces its cur¬ 
rent consumption, an increase only in its current price has a smaller 
effect on current consumption than a permanent increase in its price. 

The complementarity between present and future consumption is 
larger for more addictive goods. Therefore, permanent changes in 
prices of addictive goods might have large effects on their current 
consumption. Although our analysis implies that rational addicts re¬ 
spond more to price in the long run than in the short run, they may 
also respond a lot in the short run. 

The beginning and resumption of harmful addictions, such as 
smoking, heavy drinking, gambling, cocaine use, and overeating, and 
of beneficial addictions, such as religiosity and jogging, are often 
traceable to the anxiety, tension, and insecurity produced by adolescence,
marital breakup, job loss, and other events (see the many studies
reviewed in Peele [1985, chap. 5]). This suggests that consumption
of many harmfully addictive goods is stimulated by divorce, unem¬ 
ployment, death of a loved one, and other stressful events. If these 
events lower utility while raising the marginal utility of addictive 
goods, then changes in life cycle events have the same effect on con¬ 
sumption as changes in prices (see App. A). For example, a compen¬ 
sated increase in stress during a future finite time interval raises fu¬ 
ture c’s and future S’s. The same reasoning used to show that declines 
in future prices raise present consumption of addictive goods shows 
that anticipated future stress raises the current consumption of addic¬ 
tive goods if it raises future consumption. 

Therefore, even persons with the same utility function and the 
same wealth who face the same prices may have different degrees of 
addiction if they have different experiences. However, to avoid the 
unattractive implication of equation (2) that all persons who never 
consumed an addictive good—such as teenagers who never 
smoked—would have a zero initial stock of consumption capital, we 
assume that some events directly affect the stock of consumption
capital. If Z(t) is the rate of such events at time t, the stock adjustment
equation would be changed to 




$$\dot{S}(t) = c(t) + Z(t) - \delta S(t). \tag{22}$$

Even if c had not been consumed in the past, S would vary across 
individuals because of different experiences (Z). Appendix A ana¬ 
lyzes the effects of past and future Z’s on the current consumption 
of c. 
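The modified stock adjustment rule, in which the change in S equals c(t) + Z(t) less depreciation δS(t), is easy to simulate. The sketch below uses a simple Euler step; the function name, the step size, and the constant paths for c and Z are hypothetical choices made only for illustration.

```python
def simulate_stock(c_path, z_path, delta, s0=0.0, dt=0.1):
    """Euler simulation of dS/dt = c(t) + Z(t) - delta * S(t).

    c_path and z_path are sequences of equal length giving consumption
    and the rate of events in each step.
    """
    s = s0
    path = [s]
    for c, z in zip(c_path, z_path):
        s += dt * (c + z - delta * s)
        path.append(s)
    return path

steps = 2000
# A person who never consumes c but experiences a steady flow of events.
with_events = simulate_stock([0.0] * steps, [0.2] * steps, delta=0.5)
# The same person without such events keeps a zero stock.
without_events = simulate_stock([0.0] * steps, [0.0] * steps, delta=0.5)
```

Even with no past consumption, the stock in this example settles at Z/δ = 0.4 rather than at zero, which is the point of adding Z to the adjustment equation.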

Temporary events can permanently “hook” rational persons to ad¬ 
dictive goods. For example, a person may become permanently ad¬ 
dicted to heroin or liquor as a result of peer pressure while a teenager 
or of extraordinary stress while fighting in Vietnam. If adolescence or 
a temporary assignment in Vietnam raises demand for c in figure 2 
from ci to c a , he would temporarily move along the path p²p² from c 2
to c a . At that point—when the stress ceases—he abruptly returns to 
the path p¹p¹ at c\. (Note that his consumption during the stressful
events is affected by how temporary they are.) In this example, he 
accumulates sufficient capital while under stress to remain hooked 
afterward. Starting at Ci, he would have eventually abstained if he had 
never been subject to such stress, but instead he ends up with a sizable 
steady-state consumption. Although most Vietnam veterans did end 
their addiction to drugs after returning to the United States, many 
did not, and others shifted from dependence on drugs to dependence 
on alcohol (see Robins et al. 1980). 
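The "hooking" mechanism can be mimicked with a stylized phase-diagram simulation. This is not the optimizing model of the paper: the sketch simply assumes a reduced-form consumption rule c(S) = max(0, k0 + k1·S - k2·S² + shift), where the hypothetical shift term is positive only while the stressful episode lasts, and it integrates dS/dt = c - δS with an Euler step. All names and numbers are illustrative assumptions.

```python
def long_run_stock(s0, stress_until, stress_shift,
                   k0=-1.0, k1=1.5, k2=0.2, delta=0.5,
                   dt=0.01, horizon=60.0):
    """Stock at the end of the horizon under an assumed consumption rule.

    Stress raises consumption by stress_shift for t < stress_until only.
    """
    s = s0
    for i in range(int(round(horizon / dt))):
        t = i * dt
        shift = stress_shift if t < stress_until else 0.0
        c = max(0.0, k0 + k1 * s - k2 * s * s + shift)
        s = max(0.0, s + dt * (c - delta * s))
    return s

# Without stress the person starts below the unstable steady state
# (about 1.38 for these coefficients) and drifts toward abstention.
never_stressed = long_run_stock(s0=1.0, stress_until=0.0, stress_shift=0.0)
# A temporary episode pushes the stock past the unstable steady state.
hooked = long_run_stock(s0=1.0, stress_until=5.0, stress_shift=0.8)
```

In this example the person who is never stressed ends near a zero stock, while the person exposed to five periods of stress ends near the stable steady state (about 3.62) long after the stress has ceased.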

Some critics claim that the model in Stigler and Becker (1977)— 
presumably also the model in this paper—is unsatisfactory because it 
implies that addicts are “happy,” whereas real-life addicts are often 
discontented and depressed (see, e.g., Winston 1980). Although our 
model does assume that addicts are rational and maximize utility, they 
would not be happy if their addiction results from anxiety-raising 
events, such as a death or divorce, that lower their utility. Therefore, 
our model recognizes that people often become addicted precisely 
because they are unhappy. However, they would be even more un¬ 
happy if they were prevented from consuming the addictive goods. 

It might seem that only language distinguishes our approach to the 
effects of events on addictions from approaches based on changes in 
preferences. But more than language is involved. In many of these 
other approaches, different preferences or personalities fight for con¬ 
trol over behavior (see Yaari 1977; Elster 1979; Winston 1980; Schel- 
ling 1984). For example, the nonaddictive personality makes commit¬ 
ments when in control of behavior that try to reduce the power of the 
addictive personality when it is in control. The nonaddictive personal¬ 
ity might join Alcoholics Anonymous, enroll in a course to end smok¬ 
ing, and so forth (see the many examples in Schelling [1984]). By 
contrast in our model, present and future consumption of addictive 
goods are complements, and a person becomes more addicted at present
when he expects events to raise his future consumption. That is,
in our model, both present and future behavior are part of a consis¬ 
tent maximizing plan. 

VII. Cold Turkey and Binges 

Our theory of rational addiction can explain why many severe addic¬ 
tions are stopped only with “cold turkey,” that is, with abrupt cessa¬ 
tion of consumption. Indeed, it implies that strong addictions end only 
with cold turkey. A rational person decides to end his addiction if 
events lower either his demand for the addictive good sufficiently or 
his stock of consumption capital sufficiently. His consumption de¬ 
clines over time more rapidly when a change in current consumption 
has a larger effect on future consumption. The effect on future con¬ 
sumption is larger when the degree of complementarity and the de¬ 
gree of addiction are stronger. Therefore, rational persons end stron¬ 
ger addictions more rapidly than weaker ones. 

If the degree of complementarity and potential addiction became 
sufficiently strong, the utility function in equation (1) would no longer
be concave. Appendix B shows that the relation between present
consumption of an addictive good (c) and its past consumption (S) can
then become discontinuous at a point (Ŝ) (see fig. 2), such that c > δS
when S > Ŝ and c < δS when S < Ŝ. Although Ŝ is not a steady-state
stock, it plays a role similar to that of an unstable steady-state stock
when the utility function is concave. If S is even slightly less than Ŝ,
consumption falls along p³ to zero over time. Similarly, if S is even
slightly above Ŝ, consumption rises over time along p³, perhaps to a
stable level. However, a decline in S from just above to just below Ŝ
causes an infinite rate of fall in c because the relation between c and S
is discontinuous at Ŝ. Indeed, with a sufficiently large discontinuity,
an addict who quits would use cold turkey; that is, he would im¬ 
mediately stop consuming once he decides to stop. 

The explanation of this discontinuity is straightforward. If S is even 
slightly bigger than Ŝ, the optimal consumption plan calls for high c in
the future because the good is highly addictive. Strong complementarity
between present and future consumption then requires a high
level of current c. If S is even slightly below Ŝ, future c will be very low
because the addiction ends quickly, and strong complementarity then 
requires a low level of current c. 

Clearly, then, quitting by cold turkey is not inconsistent with our 
theory of rational addiction. Indeed, our theory even requires strong 
addictions to terminate with cold turkey. Moreover, when com¬ 
plementarity is sufficiently strong to result in a nonconcave utility 
function, we generate sharp swings in consumption in response to 
small changes in the environment when individuals either are beginning
or are terminating their addiction.

The short-run loss in utility from stopping consumption gets bigger 
as an addiction gets stronger. Yet we have shown that rational persons 
use cold turkey to end a strong addiction even though the short-run 
“pain” is considerable. Their behavior is rational because they ex¬ 
change a large short-term loss in utility for an even larger long-term 
gain. Weak wills and limited self-control are not needed to under¬ 
stand why addictions to smoking, heroin, and liquor can end only 
when the consumption stops abruptly. 

A rational addict might postpone terminating his addiction as he 
looks for ways to reduce the sizable short-run loss in utility from 
stopping abruptly. He may first try to stop smoking by attending a
smoking clinic but may conclude that this is not a good way for him. 
He may try to substitute gum chewing and jogging for smoking. 
These too may fail. Eventually, he may hit on a successful method that 
reduces the short-term loss from stopping. Nothing about rationality 
rules out such experiments and failures. Indeed, rationality implies 
that failures will be common with uncertainty about the method best 
suited to each person and with a substantial short-run loss in utility 
from stopping. 

The claims of some heavy drinkers and smokers that they want to 
but cannot end their addictions seem to us no different from the 
claims of single persons that they want to but are unable to marry or 
from the claims of disorganized persons that they want to become 
better organized. What these claims mean is that a person will make 
certain changes—for example, marry or stop smoking—when he 
finds a way to raise long-term benefits sufficiently above the short¬ 
term costs of adjustment. 

“Binges” are common in alcoholism, overeating, and certain other 
addictions. We define a binge as a cycle over time in the consumption 
of a good. Binging may seem to be the prototype of irrational behav¬ 
ior, yet a small extension of our model makes binging consistent with 
rationality. 

Consider overeating. Weight rises and health falls as eating increases.
We assume that two stocks of consumption capital determine
current eating: call one stock weight and the other “eating capital.” 
Our analysis so far, in effect, has absorbed weight and eating capital 
into a single stock (S). We readily generate binges if the two stocks 
have different depreciation rates and different degrees of com¬ 
plementarity and substitutability with consumption. 

To get cycles of overeating and dieting, one stock (say eating capi¬ 
tal) must be complementary with eating and have the higher depreci¬ 
ation rate, while the other stock (weight) must be substitutable (see eq. 
[C8] in App. C). Assume that a person with low weight and eating
capital became addicted to eating. As eating rose over time, eating 
capital would rise more rapidly than weight because it has the higher 
depreciation rate. 

Ultimately, eating would level off and begin to fall because weight
continues to increase. Lower food consumption then depreciates the
stock of eating capital relative to weight, and the reduced level of
eating capital keeps eating down even after weight begins to fall.
Eating picks up again only when weight reaches a sufficiently low
level. The increase in eating then raises eating capital, and the cycle 
begins again. These cycles can be either damped or explosive (or 
constant) depending on whether the steady state is stable or unstable. 
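A two-stock system of this kind produces cycles when its characteristic roots are complex. The sketch below is a hypothetical reduced form, not the solution derived in Appendix C: it assumes eating responds linearly to the two stocks, c = k + a·K - b·W, with eating capital K complementary (a > 0) and weight W substitutable (b > 0), and that each stock follows dK/dt = c - δ_K·K and dW/dt = c - δ_W·W. The roots of the resulting 2×2 system are then classified.

```python
import cmath

def two_stock_roots(a, b, delta_k, delta_w):
    """Characteristic roots of the assumed linear system in (K, W).

    System matrix: [[a - delta_k, -b], [a, -(b + delta_w)]].
    """
    trace = (a - delta_k) - (b + delta_w)
    det = -(a - delta_k) * (b + delta_w) + a * b
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0

def classify(roots, tol=1e-12):
    """Label the dynamics implied by a pair of characteristic roots."""
    first = roots[0]
    if abs(first.imag) <= tol:
        return "no cycles"
    if first.real < -tol:
        return "damped cycles"
    if first.real > tol:
        return "explosive cycles"
    return "constant cycles"

# Eating capital depreciates faster than weight in every example.
damped = classify(two_stock_roots(a=1.2, b=0.5, delta_k=1.0, delta_w=0.1))
explosive = classify(two_stock_roots(a=1.8, b=0.5, delta_k=1.0, delta_w=0.1))
no_cycle = classify(two_stock_roots(a=0.5, b=0.0, delta_k=1.0, delta_w=0.1))
```

In this illustration, stronger complementarity with the fast-depreciating stock (a larger a) turns damped cycles into explosive ones, while removing the offsetting weight effect (b = 0) removes the cycles altogether.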

Although, as is usual in such problems, two capital stocks are 
needed to get cycles (Ryder and Heal [1973] also get cycles in con¬ 
sumption with two capital stocks), these stocks in our analysis have a 
plausible interpretation in terms of differences in rates of deprecia¬ 
tion and degrees of complementarity and substitutability. In our anal¬ 
ysis, binges do not reflect inconsistent behavior that results from the 
struggle among different personalities for control. Rather, they are 
the outcome of consistent maximization over time that recognizes the 
effects of increased current eating on both future weight and the 
desire to eat more in the future. 

VIII. Summary and Conclusions 

In our theory of rational addiction, “rational” means that individuals 
maximize utility consistently over time, and a good is potentially ad¬ 
dictive if increases in past consumption raise current consumption. 
We show that steady-state consumption of addictive goods is unstable 
when the degree of addiction is strong, that is, when the complemen¬ 
tarity between past and current consumption is strong. Unstable 
steady states are a major tool in our analysis of addictive behavior. 
Consumption rises over time when above unstable steady-state levels, 
and it falls over time, perhaps until abstention, when below unstable 
steady states. 

Addictions require interaction between a person and a good. Obvi¬ 
ously, cigarettes and heroin are more addictive than sweaters and 
sherbet. Yet not all smokers and heroin users become addicted. We 
show that, other things the same, individuals who discount the future 
heavily are more likely to become addicted. The level of incomes, 
temporary stressful events that stimulate the demand for addictive 
goods, and the level and path of prices also affect the likelihood of 
becoming addicted. 

Permanent changes in prices of addictive goods may have a modest 
short-run effect on the consumption of addictive goods. This could be
the source of a general perception that addicts do not respond much 
to changes in price. However, we show that the long-run demand for 
addictive goods tends to be more elastic than the demand for nonad- 
dictive goods. 

Anticipated future increases in price reduce current consumption 
of addictive goods because their consumption at different moments 
of time are complements. This implies that temporary changes in the 
price of an addictive good have smaller effects on current consump¬ 
tion than (compensated) permanent changes. 

Strong addictions to smoking, drinking, and drug use are usually 
broken only by going cold turkey, that is, by abrupt withdrawal. The 
need for cold turkey may suggest a weak will or other forms of less- 
than-rational behavior. Yet we show that cold turkey is consistent with 
rational behavior. Indeed, rational persons end strong addictions 
only with rapid and sometimes discontinuous reductions in consumption.

Addiction is a major challenge to the theory of rational behavior. 
Not only cigarettes, alcoholic beverages, and cocaine are obviously 
addictive, but many other goods and activities have addictive aspects. 
We do not claim that all the idiosyncratic behavior associated with 
particular kinds of addictions are consistent with rationality. How¬ 
ever, a theory of rational addiction does explain well-known features 
of addictions and appears to have a richer set of additional implica¬ 
tions about addictive behavior than other approaches. This is the 
challenge posed by our model of rational addiction. 


Appendix A 

If the utility function is quadratic, if the events $Z(t)$ affect the stock of
consumption capital (see eq. [22]), and if the events $E(t)$ affect the utility function,
then the capital stock is a solution to the differential equation

S(t) - aS(t) - BS(t) = h(t) = a + g * 8 p(t) 

Otrc «r, 

- - 2 ?- E(t) + (o + 8) -2s- E(0 (Al) 

Orr a„ 

+ Z(f) - ^<r + 8 + -^-jz<0, 

where $\mu p(t)$ has been replaced by $p(t)$ to save on notation, $\alpha_{cE}$ is the coefficient
of $E(t)c(t)$, $a = (\delta + \sigma)(\alpha_c/\alpha_{cc}) + (\alpha_s/\alpha_{cc})$, and $B$ is given by equation (11). With
the relation between $S$ and $c$ in equation (22), the solution to this equation for
$c(t)$ that satisfies the initial condition $S(0) = S^0$ and the transversality condition
in equation (8) is
+ <• + >■)(*• + *)■“ - g^T

. (S 1 x»K6 + M + (S + x,)(8 + M 

a„(X 2 - Xj) Jo ^ a„(X 2 - Xi) 


.-*w« 


'p(t)dy 


+ ££L - ( s + x iH 8 + x 2 + (««M [' e -x.t 2(T)<fl 

“« X 2 - Xi Jo 

(S + X 2 )[S + X 1 + (a„/a rc )] 


|* e K *- T) p( T)rfr 
(A2) 


X 2 - X| 

(5 -f X|)[8 + Xi 4- (a rJ /a r ,)3 
X 2 — X| 


| C /"2(i)rfr 


Jo 


+ terms for $E(t)$ that equal $-\alpha_{cE}$ times the corresponding terms for
$p(t)$.

The definitions of $\lambda_1$ and $\lambda_2$ in equation (12) together with some simple
calculations show that

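$$\delta + \lambda_1 + \frac{\alpha_{cs}}{\alpha_{cc}} < 0, \qquad \delta + \lambda_2 + \frac{\alpha_{cs}}{\alpha_{cc}} > \delta + \frac{\sigma}{2} + \frac{\alpha_{cs}}{\alpha_{cc}} > 0. \tag{A3}$$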

Equation (A2) implies the derivatives in equations (20) and (21) of $c(t)$ with
respect to $p(\tau)$, $\tau >$ or $< t$. Essentially identical derivatives of $c(t)$ with respect
aid, i 


to £(t) hold, so that if a,, > 0, 
*<'> *0»nd 


>PIT) 


SE(t) 


§ 0 as 8 + X t % 0, 


(A4) 


which is the condition for adjacent complementarity; similarly when t > t. 
We also have 


dc(/) 


aZ(r) 


- - * M(s + x 2 + -2«.y x,T - (s + x, + -^2_V- XlT 

!*<' X 2 X) l_\ o rc / \ a„ ) 


§ 0 as 8 + Xi § 0 
via equation (A3). However, 
am 


(A5) 


az(r) 


* + X, + -2s-V' X st [(8 + X 2 )r X2 ' - (8 + x,)/ 1 '] < o. (A6) 

T>» \ Off / 

Therefore, future events that raise the stock have a negative effect on current 
consumption independent of whether there is adjacent complementarity. 


Appendix B 

If the degree of adjacent complementarity is sufficiently strong that the utility
function is no longer concave in $c(t)$ and $S(t)$, the two roots in equation (12)
will be complex, and the form of the optimal consumption path will change
significantly. We consider the case in which the utility function is still concave
in $c(t)$ and $S(t)$ separately, but it is not jointly concave in $c(t)$ and $S(t)$:

$$\alpha_{cc} < 0, \qquad \alpha_{ss} < 0, \qquad \alpha_{cc}\alpha_{ss} < \alpha_{cs}^2. \tag{B1}$$

These assumptions indicate that a high degree of complementarity between
past and current consumption (that is, between $c(t)$ and $S(t)$) is what creates
some convexity in the utility function, not a lack of concavity in either $c(t)$ or
$S(t)$ alone. In regions in which $4B < -\sigma^2$, both roots of the characteristic
equation $\lambda^2 - \sigma\lambda - B = 0$ will be complex (see eq. [13]).
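The complex-root condition can be checked numerically. The sketch below is only an illustration: the function names and the parameter values for `sigma` and `B` are hypothetical and are not taken from the paper. It solves the quadratic $\lambda^2 - \sigma\lambda - B = 0$ and compares the result with the inequality $4B < -\sigma^2$.

```python
import cmath


def characteristic_roots(sigma, B):
    """Return both roots of lambda**2 - sigma*lambda - B = 0."""
    disc = cmath.sqrt(sigma * sigma + 4.0 * B)
    return (sigma + disc) / 2.0, (sigma - disc) / 2.0


def roots_are_complex(sigma, B):
    """The roots are complex exactly when sigma**2 + 4*B is negative."""
    return 4.0 * B < -(sigma * sigma)


# Hypothetical parameter values, chosen only for illustration.
real_case = characteristic_roots(sigma=0.05, B=0.01)
complex_case = characteristic_roots(sigma=0.05, B=-0.01)
```

With these values the first pair of roots is real and the second pair is a complex conjugate pair, in line with the inequality in the text.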

If the roots are complex, the unstable steady state is replaced by a discon¬ 
tinuity in the optimal consumption function that relates consumption c to the 
current stock S. However, as long as the utility function satisfies a rf < 0, then 
this discontinuity will always be of a particular form: $c(S) < \delta S$ to the left of a
critical stock value $\hat{S}$ ($\hat{S}$ is not the same stock that satisfies the steady-state
equation), and $c(S) > \delta S$ to the right of $\hat{S}$.

If $\hat{S}$ is above a lower steady state at abstinence, the critical stock can generate
the phenomenon of quitting cold turkey. That is, consumption could fall
considerably in response to even a small rise in price or a “small” event. 

A simple example may be the best way to illustrate this result. Unfortu¬ 
nately, quadratic utility functions that satisfy the inequalities in equation (Bl) 
have unbounded utility if the horizon is infinite. However, consider the fol¬ 
lowing modified quadratic utility function: 

$$u(c(t), S(t)) = -\infty \quad \text{for } c(t) < 0 \text{ or } c(t) > C, \tag{B2}$$

so that consumption is restricted to the interval $[0, C]$, and

$$u(c(t), S(t)) = \alpha_c c(t) + \alpha_s S(t) + \alpha_{cs} c(t) S(t) + \tfrac{1}{2}\alpha_{cc} c(t)^2 \quad \text{for } 0 \le c(t) \le C.$$

Although we assume $\alpha_{ss} = 0$ to simplify the calculations, the basic results
require only that $\alpha_{cs}^2 > \alpha_{cc}\alpha_{ss}$ and $4B < -\sigma^2$.

The first-order conditions are 


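$$\begin{aligned}
&\text{if } c(t) = 0, &&\text{then } (\alpha_c - \mu) + \alpha_{cs} S(t) + \alpha_{cc} c(t) + q(t) \le 0, \\
&\text{if } 0 < c(t) < C, &&\text{then } (\alpha_c - \mu) + \alpha_{cs} S(t) + \alpha_{cc} c(t) + q(t) = 0, \\
&\text{if } c(t) = C, &&\text{then } (\alpha_c - \mu) + \alpha_{cs} S(t) + \alpha_{cc} c(t) + q(t) \ge 0.
\end{aligned} \tag{B3}$$

The term $\mu$ is the product of the marginal utility of income and the constant
price of $c$, and $q(t)$ is the shadow price of the stock.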

Define $S_l > 0$ to be the highest stock such that $c(t) = 0$ satisfies the Euler
condition for a locally optimal solution. Clearly, $S_l$ must satisfy

$$(\alpha_c - \mu) + \alpha_{cs} S_l + \frac{\alpha_s}{\sigma + \delta} = 0 \quad \text{or} \quad S_l = -\frac{1}{\alpha_{cs}}\left[(\alpha_c - \mu) + \frac{\alpha_s}{\sigma + \delta}\right]. \tag{B4}$$

We assume that $S_l$ is strictly positive; that is, we assume that the cost of
increasing from a zero stock exceeds the benefits. Similarly, define the smallest
stock, $S_h$, such that $c(t) = C$ satisfies the Euler equation (and transversality
condition). This stock is defined by

$$S_h = S_l - C\left(\frac{\alpha_{cc}}{\alpha_{cs}} + \frac{1}{\sigma + \delta}\right). \tag{B5}$$

Finally, the stock, $S^*$, that satisfies the steady-state equation is

$$(\alpha_c - \mu) + \alpha_{cs} S^* + \alpha_{cc}\delta S^* + \frac{\alpha_s + \alpha_{cs}\delta S^*}{\sigma + \delta} = 0. \tag{B6}$$

If $\alpha_{cs}/(\sigma + \delta) = -\alpha_{cc}$, then $S_l = S_h = S^*$. In this case, $c(t) = 0$, $c(t) = C$, and
$c(t) = \delta S^*$ are all solutions to the Euler equation. However, the convexity
induced by the strong complementarity between $c$ and $S$ implies that lifetime
utility will be maximized by choosing either $c(t) = 0$ or $c(t) = C$ when the
initial stock is $S^*$.


Appendix C 


We assume a quadratic utility function in $c$, $y$, and two stocks, $S_1$ and $S_2$, where
$S_1$ and $S_2$ do not interact ($\alpha_{12} = 0$). While the steady-state results are similar
for the two-stock and one-stock models, the dynamics are quite different. To
simplify notation, we transform the definitions of the stocks and consumption
so that the steady-state values of $c$ and $S$ are zero. The solution to this standard
control problem is of the form

$$S_1(t) = \phi_{11} e^{\lambda_1 t} + \phi_{12} e^{\lambda_2 t}, \qquad S_2(t) = \phi_{21} e^{\lambda_1 t} + \phi_{22} e^{\lambda_2 t}. \tag{C1}$$

The restriction that both stocks accumulate by the same consumption process 
implies that 

$$\phi_{11}(\lambda_1 + \delta_1) = \phi_{21}(\lambda_1 + \delta_2), \qquad \phi_{12}(\lambda_2 + \delta_1) = \phi_{22}(\lambda_2 + \delta_2). \tag{C2}$$


The characteristic equation for these roots requires 

. , <I>1;<*11 + 4»lj«rl(X ; + 8|) 

«rr«f>l;(k> + 81) + <Pl ; «,l + <t>2;<*r2 + - - ^ J ^ - 

1 (C3) 

+ ^22 + d>a + &i) = () ■ j , 

tr + 82 - A, 1 

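If (C2) is used to substitute in $(\lambda_j + \delta_1)/(\lambda_j + \delta_2)$ for $\phi_{2j}$ in equation (C3), then
the characteristic equation becomes

$$(\lambda + \delta_1)(\lambda + \delta_2)(\sigma + \delta_1 - \lambda)(\sigma + \delta_2 - \lambda) - (\lambda + \delta_2)(\sigma + \delta_2 - \lambda)\Delta_1 - (\lambda + \delta_1)(\sigma + \delta_1 - \lambda)\Delta_2 = 0, \tag{C4}$$

where

$$\Delta_j = -\frac{1}{\alpha_{cc}}\left[(\sigma + 2\delta_j)\alpha_{cj} + \alpha_{jj}\right], \quad j = 1, 2, \tag{C5}$$

measures the degree of substitution or complementarity.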

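Multiplying out and collecting terms yields a polynomial for the roots, $\lambda$:

$$\lambda^4 - 2\sigma\lambda^3 + (\sigma^2 - \gamma_1 - \gamma_2 + \Delta_1 + \Delta_2)\lambda^2 + \sigma(\gamma_1 + \gamma_2 - \Delta_1 - \Delta_2)\lambda + (\gamma_1\gamma_2 - \gamma_1\Delta_2 - \gamma_2\Delta_1) = 0, \tag{C6}$$

with $\gamma_j = \delta_j(\delta_j + \sigma)$, $j = 1, 2$, and where $\gamma_1\gamma_2 - \gamma_1\Delta_2 - \gamma_2\Delta_1 > 0$ is a
necessary condition for the steady state to be stable.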

The roots will be complex if

$$(\gamma_1 - \gamma_2)^2 + 2(\gamma_1 - \gamma_2)(\Delta_2 - \Delta_1) + (\Delta_1 + \Delta_2)^2 < 0. \tag{C7}$$

Equation (C6) implies that a necessary and sufficient condition for complex
roots is that

$$[(\gamma_1 - \gamma_2) + (\Delta_2 - \Delta_1)]^2 + 4\Delta_1\Delta_2 < 0. \tag{C8}$$

Equations (C7) and (C8) together show that complex roots require the stock
with the higher depreciation rate to have adjacent complementarity and the
other stock to have adjacent substitution.
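The algebra linking the factored characteristic equation to the quartic polynomial, and the two forms of the complex-root condition, can be spot-checked numerically. The sketch below is an illustration only. It assumes the readings $\gamma_j = \delta_j(\delta_j + \sigma)$ and the factored form of (C4) given in the appendix; all function names are hypothetical, and the random parameter draws carry no economic meaning.

```python
import random


def gammas(sigma, d1, d2):
    """gamma_j = delta_j * (delta_j + sigma) for the two stocks."""
    return d1 * (d1 + sigma), d2 * (d2 + sigma)


def c4_lhs(lam, sigma, d1, d2, D1, D2):
    """Left side of the factored characteristic equation (C4)."""
    return ((lam + d1) * (lam + d2) * (sigma + d1 - lam) * (sigma + d2 - lam)
            - (lam + d2) * (sigma + d2 - lam) * D1
            - (lam + d1) * (sigma + d1 - lam) * D2)


def c6_lhs(lam, sigma, d1, d2, D1, D2):
    """Left side of the expanded quartic (C6)."""
    g1, g2 = gammas(sigma, d1, d2)
    return (lam ** 4 - 2 * sigma * lam ** 3
            + (sigma ** 2 - g1 - g2 + D1 + D2) * lam ** 2
            + sigma * (g1 + g2 - D1 - D2) * lam
            + (g1 * g2 - g1 * D2 - g2 * D1))


def c7_lhs(sigma, d1, d2, D1, D2):
    g1, g2 = gammas(sigma, d1, d2)
    return (g1 - g2) ** 2 + 2 * (g1 - g2) * (D2 - D1) + (D1 + D2) ** 2


def c8_lhs(sigma, d1, d2, D1, D2):
    g1, g2 = gammas(sigma, d1, d2)
    return ((g1 - g2) + (D2 - D1)) ** 2 + 4 * D1 * D2


def max_discrepancy(trials=1000, seed=0):
    """Largest gap between (C4) and (C6), and between (C7) and (C8), over random draws."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        lam = rng.uniform(-2.0, 2.0)
        sigma, d1, d2 = (rng.uniform(0.0, 1.0) for _ in range(3))
        D1, D2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        args = (sigma, d1, d2, D1, D2)
        worst = max(worst,
                    abs(c4_lhs(lam, *args) - c6_lhs(lam, *args)),
                    abs(c7_lhs(*args) - c8_lhs(*args)))
    return worst
```

The check works because $(\lambda + \delta_j)(\sigma + \delta_j - \lambda) = \gamma_j - (\lambda^2 - \sigma\lambda)$, so (C4) is a quadratic in $\lambda^2 - \sigma\lambda$, and the left sides of (C7) and (C8) both equal the discriminant of that quadratic.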
Learning by Doing and the Introduction
of New Goods 


Nancy L. Stokey 

Northwestern University 


A dynamic general equilibrium model is developed in which goods 
are valued according to the characteristics they contain, the set of 
goods produced in any period is endogenously determined, and 
learning by doing is the force behind sustained growth. It is shown 
that the set of produced goods changes in a systematic way over time, 
with goods of higher quality entering each period and those of lower 
quality dropping out. The model is then used to study the effect of 
introducing a “traditional” sector in which there is no learning. 


I. Introduction 

Perhaps the most remarkable feature of economic growth in the de¬ 
veloped countries, especially in the period beginning with the indus¬ 
trial revolution, is the extent to which the production of goods and 
services has not merely grown but changed drastically in composition. 
Candles gave way to whale oil lamps, which in turn gave way to gas 
lights and then to incandescent bulbs. The latter have, in their turn, 
been partially displaced by fluorescent, neon, mercury-vapor, and 
sodium-vapor lights. Casual empiricism suggests that this example is 
typical rather than exceptional: many of the goods and services pro¬ 
duced today were unknown three hundred years ago, and many pro¬ 
duced then are—except through books and museums—unknown 
now. 
By contrast, most of the aggregative models of growth and develop¬
ment that economists have developed to date (the work of Ramsey 
[1928], Solow [1956], Cass [1965], Koopmans [1965], and their many 
followers) concentrate almost wholly on increases in the quantities of 
goods produced. The introduction of new goods is notable by its 
absence. 1 Technical change, when it appears at all, takes the form of 
process rather than product innovation, so that “growth” means producing
more of the same good(s). Moreover, it has proved difficult to
construct models giving rise to sustained growth, even defined in this 
narrow sense. Exogenous technical change is one “engine” for sus¬ 
tained growth in these models (as in Solow [1959], Diamond [1965], 
Shell [1967], and many others); positive externalities in production 
are another (as in Arrow [1962], Romer [1983, 1986], and Lucas 
[1988]). 

In this paper a simple dynamic general equilibrium model is de¬ 
veloped in which competitive equilibrium paths feature sustained 
growth and in which the introduction of new and better products is 
an integral part of that growth. Specifically, the main features of the 
model are that there is a continuum of potentially producible goods; 
in each period only a limited subset of the goods are actually pro¬ 
duced; over time the set of produced goods changes, with higher- 
quality goods entering the produced set and those of lower quality 
dropping out; and in the long run growth continues without bound. 
The accumulation of knowledge, through economywide learning by 
doing, is the sole force behind the growth; there is no physical capital. 
Other features of the model are standard: labor is inelastically sup¬ 
plied, within each period all goods are produced with constant re¬ 
turns to scale technologies, and all markets are perfectly competitive. 

Thus the model is similar in several respects to those in the papers 
by Arrow, Romer, and Lucas mentioned above: there is endoge¬ 
nously generated, sustained growth in per capita output; growth is 
driven by the accumulation of knowledge; and there is an externality 
in the accumulation of knowledge. It is also like the model of Arrow 
in that the accumulation of knowledge is the result of experience in 
production rather than a separate activity (although many of the ar¬ 
guments here would also apply to models based on R & D or educa¬ 
tion). The main differences are the absence of physical capital and the 
specification of the commodity space and preferences. 

The absence of physical capital may at first seem startling. How¬ 
ever, as noted above, “growth” models built around the accumulation 

1 An exception to this generalization is the model of research and development
introduced in Judd (1985). However, that model is an explanation of product differ¬ 
entiation: it does not yield sustained growth in the long run. Schmitz (1987) looks at a 
modified version of Judd's model and studies long-run product development. 
of physical capital alone do not give rise to sustained growth. The
models that do are those built around the (endogenously determined) 
accumulation of knowledge or around (exogenously given) techno¬ 
logical change. The work presented here focuses entirely on the ac¬ 
cumulation of knowledge and dispenses with physical capital al¬ 
together. The benefits of this, in terms of simplicity, will be apparent; 
the costs will be discussed in the conclusions. 

The description of the commodity space and preferences are also 
unusual for a model of economic growth. Since they are central to the 
results, they need some justification. 

Why is it that (most) people in the industrialized countries no 
longer eat gruel, read by candlelight, or sleep in log cabins? The 
obvious answer is that they can afford to buy steak dinners, electric 
lights, and houses with central heating instead. They can afford these 
goods because real incomes have gone up; that is, the real cost of 
producing almost all goods has gone down. Still, why doesn’t the 
consumer eat some gruel as well as some steak, as convexity of prefer¬ 
ences suggests he should? The answer to this seems clear. Gruel is 
cheap and provides calories but otherwise does not have much to 
recommend it. Steak dinners provide a variety of vitamins, minerals, 
and protein, in addition to calories, and are much tastier as well. In 
this sense they are strictly “better” foods. Moreover, it is impossible to
get the protein, good taste, and so forth without getting plenty of
calories. Thus the one thing that gruel provides is supplied in
sufficient quantity by the better foods, so gruel is redundant. A little 
reflection suggests that similar arguments can be made in many other 
instances: a new good often replaces an old one because it does or 
provides everything the old one did, and more as well. 

This suggests that a Lancasterian (1966) characteristics model of 
commodities and preferences may be a useful framework for the 
problem at hand. The rest of the paper shows that this is indeed the 
case and is organized as follows. In Section II, specific assumptions 
are developed under which the dynamics of product introduction are 
as described above. It is also shown that such an economy will display 
sustained growth in the sense that GNP, as conventionally measured, 
will increase every period. The consequences of adding a “traditional”
sector—one without learning—are explored in Section III, and the 
conclusions are discussed in Section IV. 


II. Learning by Doing and New Goods 

Assume that the economy has many identical consumers and many 
identical firms, and all markets are perfectly competitive. All consum¬ 
ers and firms are infinitely long-lived, and there is no uncertainty. 



704 JOURNAL OF POLITICAL ECONOMY

There is no capital, contemporaneous labor is the only factor of production,
and all produced goods are perishable. All goods (including
labor) are traded on spot markets in each period, and these are the 
only markets available. The consumer has a constant endowment of 
y > 0 units of labor each period, and his preferences are additively 
separable over time. 

In each period there is a continuum of potentially producible goods
indexed by $s \in R_+$ and a continuum of characteristics indexed by $z \in R_+$.
A goods allocation in period $t$ is represented by a piecewise continuous
density, $x_t(s)$, $s \ge 0$. Good $s$ provides one unit of each of the
characteristics $z \in [0, s]$, so that the goods allocation $x_t$ contains the
allocation of characteristics $q_t$ given by

$$q_t(z) = \int_z^\infty x_t(s)\,ds, \quad z \ge 0. \tag{1}$$

Thus higher-index goods are better in the sense that they provide 
more characteristics, and the notion of “better” or “higher quality” is 
not linked to any particular specification of preferences. For any preferences
that are increasing in all characteristics, additional units of
higher-index goods are always preferred, at the margin, to units of 
lower-index goods, regardless of the initial allocation. Define 

$$X = \{x: R_+ \to R_+ \mid x \text{ is piecewise continuous, and for some } B \ge 0,\ x(s) = 0,\ s > B\},$$

$$Q = \{q: R_+ \to R_+ \mid q \text{ is nonincreasing and piecewise continuously differentiable, and for some } B \ge 0,\ q(z) = 0,\ z \ge B\}.$$

Then $x_t \in X$ and $q_t \in Q$, all $t$, and (1) defines a one-to-one mapping
between $X$ and $Q$.

For simplicity, temporarily drop the subscript $t$. Assume that within
each period, the consumer’s preferences over allocations of characteristics
$q \in Q$ are additively separable and symmetric:

$$U(q) = \int_0^\infty u(q(z))\,dz.$$

These preferences are tractable yet, given the link between goods and
characteristics in (1), imply strong income effects. In particular, any
good is inferior at high enough levels of income. The function $u$ will
be restricted as follows.

Assumption 1. $u$ is strictly increasing, strictly concave, and twice
continuously differentiable, with $u(0) = 0$ and $u'(0) < \infty$.

It is important that u'(0) is finite since the equilibria will involve 
zero consumption of many characteristics. 

All goods are produced in competitive industries, with constant
returns to scale technologies and with contemporaneous labor as the
only input. The links between periods come from the fact that production
is subject to economywide learning by doing: the unit labor
requirement for production of any good by any firm in any period
depends on the entire economy’s cumulative experience in production
of all goods in all previous periods. That is, learning displays complete
spillovers among firms and, in addition, may display spillovers
among goods.

Let experience in any period be described by the state variable $k$, an
index of “knowledge capital,” taking values in the set $K$. The variable
$k$ may be a finite-dimensional vector, $k = (k_1, \ldots, k_n)$; an infinite-dimensional
vector, $k = (k_1, k_2, \ldots)$; or a real-valued function, $k(\xi)$,
$\xi \ge 0$. 2 In particular, it may be the function describing cumulative
experience, $k_t(s) = \sum_{\tau=0}^{t-1} x_\tau(s)$, $s \ge 0$. The law of motion for $k$ will be
discussed below.

Within each period the technology displays constant returns to
scale. Specifically, given $k \in K$, the total labor required to produce any
goods allocation $x \in X$ is $\int_0^\infty p(s, k)x(s)\,ds$. The function $p$ will be
restricted as follows.

Assumption 2. For each $k \in K$, (i) $p(\cdot, k)$ is twice continuously
differentiable and strictly increasing, with $p(0, k) = 0$; (ii) $p(\cdot, k)$ is
weakly concave on $[0, m)$ and strictly convex on $(m, \infty)$, for some $0 \le m
< \infty$; and (iii) $\lim_{s \to \infty} p_1(s, k) = +\infty$.

Part i of this assumption says that within any period the unit cost of
production increases smoothly with the quality of the good, with the
worthless ($s = 0$) good costless to produce. Since $p(\cdot, k)$ and $q(\cdot)$ are
both differentiable, with $x = -q'$, it then follows from an integration
by parts that for any allocation $x$ containing the characteristics $q$,

$$\int_0^\infty p(s, k)x(s)\,ds = \int_0^\infty p_1(s, k)q(s)\,ds.$$

Hence $p_1(\cdot, k)$ can be interpreted as the unit cost function for characteristics,
in the sense that the cost of producing any goods allocation is
simply the cost of producing the characteristics it contains. Part ii of
the assumption then says that for fixed knowledge $k$, the unit cost
curve for goods is either strictly convex or weakly concave/strictly
convex. Hence the unit cost curve for characteristics is either strictly
increasing or “single-troughed” (where the “trough” may be a “flat”).
Part iii says that the unit cost curve for characteristics increases without
bound as $z \to \infty$.
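The equality between the cost of a goods allocation and the cost of the characteristics it contains can be illustrated on a grid. The sketch below is an assumption-laden example, not part of the paper: the cost function $p(s, k) = (s + s^2/2)/k$, the goods density that equals one on $[1, 2]$ and zero elsewhere, and all function names are hypothetical choices made only so that both integrals are easy to evaluate.

```python
def p(s, k=1.0):
    """Hypothetical unit cost of good s: p(0, k) = 0, strictly increasing, strictly convex."""
    return (s + 0.5 * s * s) / k


def p1(s, k=1.0):
    """Derivative of p with respect to s, read as the unit cost of characteristic s."""
    return (1.0 + s) / k


def x_density(s):
    """Hypothetical goods allocation: one unit of density on [1, 2], zero elsewhere."""
    return 1.0 if 1.0 <= s <= 2.0 else 0.0


def costs(upper=3.0, n=3000):
    """Midpoint-rule values of the goods-side and characteristics-side cost integrals."""
    ds = upper / n
    mids = [(i + 0.5) * ds for i in range(n)]
    x = [x_density(s) for s in mids]
    # q(z) is the integral of x from z upward, evaluated at each cell midpoint.
    q = [0.0] * n
    tail = 0.0
    for i in range(n - 1, -1, -1):
        q[i] = tail + 0.5 * x[i] * ds
        tail += x[i] * ds
    cost_goods = sum(p(s) * xi for s, xi in zip(mids, x)) * ds
    cost_chars = sum(p1(s) * qi for s, qi in zip(mids, q)) * ds
    return cost_goods, cost_chars
```

For this density the goods-side integral is $\int_1^2 (s + s^2/2)\,ds = 8/3$, and the characteristics-side sum should agree with it up to discretization error.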


2 In general, $K$ may be any set with a relationship “$\ge$” satisfying (i) $k \ge k$, all $k \in K$
(reflexive), and (ii) $k_a \ge k_b$ and $k_b \ge k_c$ implies $k_a \ge k_c$, all $k_a, k_b, k_c \in K$ (transitive).
The relationship need not be complete. That is, there may be $k, \hat{k} \in K$, such that $k \not\ge \hat{k}$
and $\hat{k} \not\ge k$.
Competitive equilibrium prices and quantities are then determined
as follows. At the beginning of period $t$, knowledge $k_t$ is given. The
assumptions of perfect competition and constant returns to scale then
imply that all goods are priced at cost. (Since learning spills over
completely and with no lag to other firms, it is not in the interest of
any producer to suffer current losses in order to accelerate learning.)
That is, with the price of labor normalized to unity, the function
$p(\cdot, k_t)$ describes competitive equilibrium goods prices. Equilibrium
quantities are then determined by the preferences of the representative
consumer. The (as yet unspecified) law of motion for knowledge
then determines knowledge in the subsequent period, $k_{t+1}$, as a function
of $k_t$ and $x_t$. Therefore, given initial knowledge $k_0$ in period 0, the
equilibrium paths for knowledge, prices, and output can be determined.
The goal here is to find assumptions under which only a
limited set of goods is produced in each period, and over time lower-quality
goods drop out of the produced set and higher-quality goods
enter. In the context of this model, the latter will be interpreted to
mean that equilibrium quantities $\{x_t\}_{t=0}^\infty$ have the following features: in
each period $t$ the set of goods actually produced is an interval $[A_t, B_t]$,
and both $\{A_t\}$ and $\{B_t\}$ are increasing sequences.

First consider the determination of equilibrium quantities within
any period. That is, consider a consumer with the preferences above
and an endowment of labor $y > 0$, facing the prices $p(\cdot, k)$. His problem is

$$\begin{aligned}
&\max_{x \in X} \int_0^\infty u\left(\int_z^\infty x(s)\,ds\right) dz \\
&\text{subject to } \int_0^\infty p(s, k)x(s)\,ds - y \le 0, \\
&\qquad\qquad\quad x(s) \ge 0, \text{ all } s.
\end{aligned} \tag{2}$$

The solution to this problem is characterized in the following lemma. 

Lemma 1. Let $u$ and $p$ satisfy assumptions 1 and 2, respectively.
Then for any $k \in K$, the solution $x$ to (2) is unique and has the
following form:

$$x(s) \begin{cases} = 0, & s \in [0, A), \\ > 0, & s \in [A, B], \\ = 0, & s \in (B, \infty), \end{cases} \tag{3}$$

where

$$A = \max\{s \ge 0 \mid p(s, k) - s\,p_1(s, k) = 0\} \tag{4}$$

and $B > A$. Moreover, $x$ is continuous on $[A, B]$.
Proof. The problem in (2) is equivalent to

$$\max_{q \in Q} \int_0^\infty u(q(z))\,dz \tag{5}$$

subject to

$$\int_0^\infty p_1(z, k)q(z)\,dz - y \le 0, \tag{6}$$

$$q'(z) \le 0, \text{ all } z. \tag{7}$$

The feasible set for this problem is convex, and under assumption 1
the objective function is strictly concave. Hence the solution (if one
exists) is unique and satisfies the first-order condition

$$\int_0^s u'(q(z))\,dz - \lambda p(s, k) \le 0, \tag{8}$$

with equality if $q'(s) < 0$, all $s$. First it will be shown that for any $\lambda > 0$,
there is a unique function $\psi(z, \lambda)$ satisfying (7)-(8) and then that, for
an appropriate choice of $\lambda$, (6) also holds.

Define $A \ge 0$ by (4). If $p(\cdot, k)$ is strictly convex, then $A = 0$. If $p(\cdot, k)$
is concave-convex, then $A > 0$ as shown in figure 1. Note that in either
case $p_1(\cdot, k)$ is strictly increasing on $[A, \infty)$, and $p(s, k) \ge s\,p_1(A, k)$, all $s$.
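The tangency condition that pins down the lower boundary $A$ can be made concrete with a concave-convex cost function. The sketch below is illustrative only: the cubic $p(s) = s^3/3 - s^2 + 2s$ (with knowledge held fixed and suppressed), the bracketing interval, and the function names are hypothetical choices. This cubic is concave on $[0, 1)$, strictly convex on $(1, \infty)$, and strictly increasing, so it fits the shape restrictions placed on the unit cost function.

```python
def p(s):
    """Hypothetical concave-convex unit cost function, with p(0) = 0."""
    return s ** 3 / 3.0 - s ** 2 + 2.0 * s


def p1(s):
    """Derivative of p; equals (s - 1)**2 + 1, so p is strictly increasing."""
    return s ** 2 - 2.0 * s + 2.0


def tangency_gap(s):
    """p(s) - s*p1(s); a zero means the ray from the origin is tangent to p at s."""
    return p(s) - s * p1(s)


def lower_boundary(lo=1.0, hi=10.0, tol=1e-12):
    """Bisection for the largest zero of the tangency gap.

    Assumes the gap is positive at lo and negative at hi, which holds for
    this particular cubic on the default bracket.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tangency_gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


A = lower_boundary()
```

For this cubic the gap simplifies to $s^2(1 - 2s/3)$, so the largest zero is $s = 3/2$, and $p(s) - s\,p_1(A) = (s/3)(s - 3/2)^2 \ge 0$, which matches the inequality noted in the proof.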

Fix $\lambda > 0$. If $u'(0) \le \lambda p_1(A, k)$, let $\psi(z, \lambda) = 0$, all $z$. Clearly (7) and
(8) hold. If $u'(0) > \lambda p_1(A, k)$, define $B \ge A$ by $u'(0) = \lambda p_1(B, k)$; it
follows from parts ii and iii of assumption 2 that $B$ is well defined.
Then define $\psi(\cdot, \lambda)$ by

$$u'(\psi(z, \lambda)) = \lambda p_1(z, k), \quad z \in [A, B], \tag{9a}$$

$$\psi(z, \lambda) = \psi(A, \lambda), \quad z \in [0, A), \tag{9b}$$

$$\psi(z, \lambda) = 0, \quad z \in (B, \infty), \tag{9c}$$

as shown in figure 2. Note that $\psi_1(z, \lambda) = 0$, for $z \in [0, A) \cup (B, \infty)$.
Moreover, since both $u'$ and $p_1(\cdot, k)$ are continuously differentiable, it
follows from (9a) that $\psi(\cdot, \lambda)$ is continuously differentiable on $(A, B)$,
with $\psi_1(z, \lambda) = \lambda p_{11}(z, k)/u''(\psi(z, \lambda))$. Since $u$ is strictly concave and
$p(\cdot, k)$ is (on this region) strictly convex, it follows that $\psi_1(z, \lambda) < 0$,
so that $\psi(\cdot, \lambda)$ satisfies (7).

Next consider (8). Since $u'(\psi(A, \lambda)) = \lambda p_1(A, k)$ and $p(s, k) \ge
s\,p_1(A, k)$, all $s$, it follows from (9b) that, for $s \in [0, A)$,

$$\int_0^s u'(\psi(z, \lambda))\,dz - \lambda p(s, k) \le s[u'(\psi(A, \lambda)) - \lambda p_1(A, k)] = 0.$$

Hence (8) holds for $s \in [0, A)$. For $s \in [A, B]$,

$$\begin{aligned}
\int_0^s u'(\psi(z, \lambda))\,dz - \lambda p(s, k)
&= A\,u'(\psi(A, \lambda)) + \int_A^s u'(\psi(z, \lambda))\,dz - \lambda\left[p(A, k) + \int_A^s p_1(z, k)\,dz\right] \\
&= A[u'(\psi(A, \lambda)) - \lambda p_1(A, k)] + \int_A^s [u'(\psi(z, \lambda)) - \lambda p_1(z, k)]\,dz \\
&= 0,
\end{aligned}$$

where the second line uses (9b), the third uses (4), and the last uses
(9a). Hence (8) also holds for $s \in [A, B]$. Finally, with this result and
(9c), it follows that, for $s \in (B, +\infty)$,

$$\int_0^s u'(\psi(z, \lambda))\,dz - \lambda p(s, k) = \int_B^s [u'(0) - \lambda p_1(z, k)]\,dz.$$

From the definition of $B$ and the fact that $p(\cdot, k)$ is strictly convex in
this region, it follows that the integrand on the right is negative, so
that (8) holds for $s \in (B, +\infty)$.

Next note that, for each $z$, $\psi(z, \cdot)$ is monotone in $\lambda$ (strictly monotone
for $z \in [0, B]$), with $\lim_{\lambda \to \infty} \psi(z, \lambda) = 0$ and $\lim_{\lambda \to 0} \psi(z, \lambda) = +\infty$.
Hence for a unique value $\lambda^*$,

$$\int_0^\infty p_1(z, k)\psi(z, \lambda^*)\,dz = y,$$

so that $q(z) = \psi(z, \lambda^*)$, all $z$, is a solution to (6)-(8). Moreover, it is
clear that if $y > 0$, then $q \ne 0$, so that $A < B$. When $q'(A)$ and $q'(B)$ are
taken to be the right and left derivatives, respectively, it follows that $x
= -q'$ is the unique solution to (2) and has the properties claimed.
Q.E.D.
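The construction in the proof can be traced through on a simple parametric case. The sketch below is an illustration under assumptions that are not in the paper: utility $u(q) = \log(1 + q)$, cost $p(s, k) = (s + s^2/2)/k$ with scalar knowledge $k$, and hypothetical function names. The cost function is strictly convex, so the lower boundary is zero for every $k$. The example therefore shows only how the multiplier and the upper boundary are pinned down by the budget, not the rightward shift of the lower boundary.

```python
def p1(z, k):
    """Unit cost of characteristic z under the hypothetical cost p(s, k) = (s + s*s/2)/k."""
    return (1.0 + z) / k


def psi(z, lam, k):
    """Demand for characteristic z given multiplier lam.

    With u(q) = log(1 + q), u'(q) = 1/(1 + q), so u'(psi) = lam * p1 gives
    psi = 1/(lam * p1) - 1, set to zero where that expression is negative.
    """
    return max(0.0, 1.0 / (lam * p1(z, k)) - 1.0)


def upper_boundary(lam, k):
    """Solve u'(0) = 1 = lam * p1(B, k) for B; zero if nothing is purchased."""
    return max(0.0, k / lam - 1.0)


def spending(lam, k, n=2000):
    """Midpoint-rule value of the labor cost of the characteristics demanded at lam."""
    B = upper_boundary(lam, k)
    dz = B / n
    return sum(p1((i + 0.5) * dz, k) * psi((i + 0.5) * dz, lam, k)
               for i in range(n)) * dz


def solve(y, k, tol=1e-12):
    """Bisect on lam until spending equals the labor endowment y."""
    lo, hi = 1e-9, k  # spending is very large at lo and zero at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if spending(mid, k) > y:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, upper_boundary(lam, k)


lam_star, B = solve(y=2.0, k=1.0)
```

For this specification the budget integral can also be done by hand, giving $B = \sqrt{2ky}$ and $\lambda^* = k/(1 + \sqrt{2ky})$, so with $y = 2$ and $k = 1$ the bisection should return values close to $\lambda^* = 1/3$ and $B = 2$. Raising $k$ with $y$ fixed raises $B$ in this example.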

Lemma 1 shows that, within each period, the set of goods produced
in competitive equilibrium is a bounded interval $[A, B]$. The lower
boundary $A$ of the produced set is zero if $p(\cdot, k)$ is strictly convex and
otherwise is determined by the tangency condition illustrated in
figure 1. Thus in either case it is determined by properties of the unit
cost function $p(\cdot, k)$ alone. The upper bound $B$ of the produced set
depends, in either case, on properties of the preferences and the
value of the labor endowment, as well as on the cost function.

Lemma 1 also shows that any concave-convex unit cost function
$p(\cdot, k)$ can be replaced with its greatest convexification, the greatest
weakly convex function that is everywhere equal to or less than $p(\cdot, k)$,
without changing the solution to the consumer’s problem. To see this,
refer again to figure 1. Suppose that the cost function pictured there
is replaced by the function (not pictured) that is equal to zero at zero,
is equal to $p(\cdot, k)$ on $[A, \infty)$, and is linear on the interval $[0, A]$. Clearly,
at these prices the consumer cannot do better than the allocation
chosen at prices $p(\cdot, k)$.

Characterizing the evolution over time of a competitive economy’s 
production of goods requires characterizing the behavior of the set 
[A, B] as knowledge increases. To do this, another assumption will be 
needed. 

Assumption 3. For any $k, \hat{k} \in K$ with $k < \hat{k}$: (i) for $A, \hat{A}$ defined in
(4), $A < \hat{A}$; and (ii) $p_1(z, \hat{k})/p_1(z, k)$ is not greater than unity and
weakly decreasing in $z$, for $z \in [0, \hat{A}]$, and is less than unity and strictly
decreasing in $z$, for $z \in (\hat{A}, \infty)$.

Part i of this assumption ensures that the lower bound of the produced
set shifts to the right as knowledge increases. Note that $p(\cdot, \hat{k})$
cannot be strictly convex on all of $R_+$ if $\hat{k} > k$ since this would imply
$\hat{A} = 0$. Part ii ensures that an increase in knowledge reduces the cost
of every characteristic (and hence of every good) and has a relatively
greater effect on the costs of higher-index characteristics.

The next lemma describes how the set of produced goods changes 
as knowledge increases. 

Lemma 2. Let $u$ satisfy assumption 1, and let $p$ satisfy assumptions 2
and 3. Let $k, \hat{k} \in K$, with $\hat{k} > k$, and let $(x, \lambda)$ and $(\hat{x}, \hat{\lambda})$ be solutions of
(2), for $k$ and $\hat{k}$, respectively. Let $[A, B]$ and $[\hat{A}, \hat{B}]$ be the intervals on
which $x$ and $\hat{x}$, respectively, are positive. Then $A < \hat{A}$ and $B < \hat{B}$.

Proof. The first claim follows trivially from part i of assumption 3.
Consider $B$ and $\hat{B}$. Let $q$ and $\hat{q}$ be the allocations of characteristics
corresponding to $x$ and $\hat{x}$, respectively. First it will be shown that

$$\frac{p_1(\hat{B}, \hat{k})}{p_1(\hat{B}, k)} < \frac{\lambda}{\hat{\lambda}}. \tag{10}$$

Suppose the contrary. Then it follows from part ii of assumption 3
that $p_1(z, \hat{k})/p_1(z, k) \ge \lambda/\hat{\lambda}$, all $z \in [0, \hat{B}]$. Since $\lambda p_1(z, k) \ge u'(q(z))$, all
$z \in [A, +\infty)$, and $\hat{A} > A$, it then follows that

$$u'(\hat{q}(z)) = \hat{\lambda} p_1(z, \hat{k}) \ge \lambda p_1(z, k) \ge u'(q(z)), \text{ all } z \in [\hat{A}, \hat{B}].$$

This in turn implies that $\hat{q}(z) \le q(z)$, all $z \in [\hat{A}, \hat{B}]$, and hence that $\hat{q}(z)
= \hat{q}(\hat{A}) \le q(\hat{A}) \le q(z)$, all $z \in [0, \hat{A})$. Since $p_1(z, \hat{k}) \le p_1(z, k)$, all $z$, with
strict inequality on $(\hat{A}, +\infty)$, it then follows that the budget constraint
(6) cannot hold for both situations. Hence (10) holds, and it follows
that

$$\lambda p_1(B, k) = u'(0) = \hat{\lambda} p_1(\hat{B}, \hat{k}) < \lambda p_1(\hat{B}, k).$$

Since $p_1(\cdot, k)$ is strictly increasing in $z$ for $z \in (A, +\infty)$, it then follows
that $B < \hat{B}$. Q.E.D.

Lemma 2 shows that under assumptions 1-3 the set of produced 
goods shifts to the right as knowledge grows. Specifically, greater 
knowledge implies that lower-index goods drop out of the produced 
set and higher-index goods enter. 

Finally, to characterize the competitive equilibrium of a multiperiod
economy, the dynamics of knowledge accumulation must be
specified. Let $h: K \times X \to K$ be the law of motion for knowledge, $k_{t+1}
= h(k_t, x_t)$. The only restriction on the function $h$ that will be needed is
the following.

Assumption 4. For all $k \in K$ and all $x \in X$, $h(k, x) \ge k$ with equality
only if $x \equiv 0$. 3

Theorem 1. Let $u$ satisfy assumption 1, let $p$ satisfy assumptions 2
and 3, let $h$ satisfy assumption 4, and let $k_0 \in K$ be given. Then the
unique competitive equilibrium sequence of prices, allocations, and
knowledge, $\{p(\cdot, k_t), x_t(\cdot), k_t\}_{t=0}^\infty$, for an economy beginning with knowledge
$k_0$ in period 0, has the following properties. In each period $t =
0, 1, \ldots$, goods prices $p(s, k_t)$ are strictly increasing in $s$, only goods in
a finite range $[A_t, B_t]$ are produced, and the allocation $x_t$ is continuous
on $[A_t, B_t]$. Over time, the sequence of price functions $\{p(\cdot, k_t)\}$ is
strictly decreasing, and the sequences $\{A_t\}$, $\{B_t\}$, and $\{k_t\}$ are all strictly
increasing.

Proof. All the claims follow directly from lemmas 1 and 2 and as¬ 
sumption 4. Q.E.D. 

With the model just described, it is possible (easy, in fact) to mea¬ 
sure the rate of growth in real output, even though new goods are 
being produced every period. The reason is that unproduced goods 
in any period have a well-defined price: their unit cost of production. 
Hence it is quite simple to compare the value of output in periods t 
and t + 1, both evaluated at period t prices. Doing so gives a conven¬ 
tional measure of period-to-period growth in real GNP. The next 
theorem shows that the rate of growth, so measured, is always 
positive. 

Theorem 2. Under the assumptions of theorem 1,

$$\int_0^\infty p(s, k_t)x_{t+1}(s)\,ds > \int_0^\infty p(s, k_t)x_t(s)\,ds, \text{ all } t.$$


3 It would seem reasonable to require that h be increasing in x, for each fixed k, but
this assumption is not needed for theorem 1. It would be needed to get sensible results
in an analysis of optimal allocations, not discussed here.
Proof. It follows immediately from (2) and assumption 3 that

$$\int_0^\infty p(s, k_t)\, x_{t+1}(s)\, ds > \int_0^\infty p(s, k_{t+1})\, x_{t+1}(s)\, ds = y = \int_0^\infty p(s, k_t)\, x_t(s)\, ds, \quad \text{all } t.$$

Q.E.D.
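The growth measure in theorem 2 can be illustrated with a minimal numerical sketch. The four-good economy below is entirely hypothetical (the price and quantity vectors are invented for illustration and are not taken from the model's solution); it only assumes, as in the text, that each period's output costs y at that period's prices, that prices fall between periods, and that unproduced goods are priced at unit cost.

```python
def real_growth(prices_t, x_t, x_next):
    """Growth in real output between t and t+1, both valued at period-t prices."""
    value_t = sum(p * x for p, x in zip(prices_t, x_t))
    value_next = sum(p * x for p, x in zip(prices_t, x_next))
    return value_next / value_t - 1.0


# Hypothetical discretized economy with four goods (illustrative numbers only).
prices_t = [1.0, 2.0, 3.0, 4.0]     # period-t prices, increasing in the good index
prices_next = [0.9, 1.6, 2.4, 3.0]  # period-(t+1) prices, strictly lower for every good
x_t = [2.0, 2.0, 1.0, 0.0]          # good 3 is not yet produced in period t
x_next = [0.0, 1.5, 1.5, 1.0]       # good 0 drops out, good 3 enters

growth = real_growth(prices_t, x_t, x_next)
print(f"real growth at period-t prices: {growth:.4f}")
```

Because the period t + 1 bundle exhausts the same budget at lower prices, it must cost more than the budget at the old prices, which is why the measured growth rate is positive.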

The rate of growth may be increasing, decreasing, or constant over
time or may display more complicated behavior, depending on the
particular assumptions made about the functions u, p, and h. 4

III. Incorporating a “Traditional” Sector 

Suppose that the economy has, in addition to the “learning” sector 
described above, a “traditional” sector in which there is no learning. 
For simplicity call these sectors manufacturing and agriculture. Take 
preferences of the representative consumer to be 

$$V\left[a, \int_0^\infty u(q(z))\, dz\right], \tag{11}$$

where a is the quantity of agricultural goods consumed, and V is 
continuous, strictly increasing, and strictly concave. Without loss of 
generality, assume that units of agricultural goods have been defined 
so that one unit of labor produces one unit of agricultural goods. 
Then the technology is 

$$a + \int_0^\infty p_1(z, k)\, q(z)\, dz - y \le 0. \tag{12}$$

The assumptions of perfect competition and constant returns to
scale imply that, with the price of labor normalized to unity, the
competitive equilibrium price of agricultural goods is unity and the
prices for manufactured goods are given by $p(\cdot, k)$. Competitive
equilibrium quantities are given by the solution to the consumer’s
problem: maximize (11) subject to (12) and the constraints $a \ge 0$ and
$q'(z) \le 0$, all z.

First it will be shown that there may be equilibrium paths that display
no growth and that these paths are unstable in the sense that a

4 An example in which the economy converges asymptotically to a constant rate of
growth is available on request from the author. The key features of this example are
that experience is one-dimensional, and additional restrictions are imposed on the cost
function and the law of motion for knowledge. These assumptions make costs and
learning stationary when scaled to an appropriate (common) point in characteristic
space.
(large enough) perturbation in the initial state sets the economy onto
a path of sustained growth. For any U > 0 and k > 0, define E(U, k) to
be the expenditure function for manufactured goods:

$$E(U, k) = \min_{q \in Q} \int_0^\infty p_1(z, k)\, q(z)\, dz, \quad \text{subject to } \int_0^\infty u(q(z))\, dz \ge U.$$

With E so defined, the share of total man-hours devoted to each
sector is then given by the solution to

$$\max_{a,\, U \ge 0} V(a, U), \quad \text{subject to } a + E(U, k) - y \le 0.$$


If the preferences and technology are such that
$V_1(y, 0) \ge V_2(y, 0)/E_1(0, 0)$, then in equilibrium an economy with no
experience in manufacturing ($k_0 = 0$) produces no manufactured goods
(U = 0). Such an economy remains stagnant forever ($k_t = 0$, all t).
However, if this economy somehow acquires enough experience to reverse
that inequality, it then produces manufactured goods (U > 0) so that
experience grows ($k_{t+1} > k_t$). The same is then true in every
subsequent period as well. Thus there may be a dynamic competitive
equilibrium that is unstable against (large enough) perturbations in the
initial state.

Next consider the change over time in hours devoted to agriculture.
It follows from assumption 3 that if $\{k_t\}$ is strictly increasing,
then the prices of all manufactured goods fall over time. This has two
effects. The change in relative prices tends to decrease consumption
of agricultural goods, but the increase in real income tends, assuming
that agricultural goods are “normal,” to increase the quantity
consumed. The net effect is the sum of these substitution and income
effects, and either may predominate. This statement can be made
precise by studying the market and compensated demand functions
for agricultural goods. Since the prices of manufactured goods depend
on knowledge, in this context both demand functions will have k
as an argument instead of the usual vector of goods prices. For
simplicity let k be a scalar.

It is useful first to define the indirect utility function $\hat U$ by

$$\hat U(e, k) = \max_{q \in Q} \int_0^\infty u(q(z))\, dz \quad \text{subject to } \int_0^\infty p_1(z, k)\, q(z)\, dz - e \le 0. \tag{13}$$


Call $\hat U(e, k)$ the “felicity” attainable from manufactured goods when
total expenditure on those goods is e and prices are $p(\cdot, k)$. It is
immediate that since u is strictly increasing and strictly concave,
$\hat U$ is strictly increasing and strictly concave in its first argument.

With $\hat U$ so defined, consider the two problems

$$\max_{a} V[a, \hat U(y - a, k)] \tag{14}$$

and

$$\min_{a,\, y}\; y \quad \text{subject to } V[a, \hat U(y - a, k)] = v. \tag{15}$$


Since V is strictly concave and $\hat U$ is strictly concave in its first
argument, both have unique solutions; call them a(k, y) and
$[a^c(k, v), y^c(k, v)]$. The functions a and $a^c$ are the market and
compensated demand functions for agricultural goods.

Assume that (14) and (15) have interior solutions, 0 < a < y. Then a
and $a^c$ are characterized by the appropriate first-order conditions
and, in the case of $a^c$, by the utility constraint. Differentiating these
conditions, one finds that


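$$\frac{\partial a}{\partial k} = \left(\frac{\hat U_2}{\hat U_1}\right) \frac{\partial a}{\partial y} + \frac{\partial a^c}{\partial k}. \tag{16}$$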


Thus the effect on the demand for agricultural goods of a change in
knowledge (and hence a change in manufactured goods prices) can be
decomposed into an income effect and a substitution effect. It is
tedious but straightforward to show that, as usual, the former is of
ambiguous sign and the latter is negative.
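One way to see where the decomposition (16) comes from is the following sketch. It assumes differentiability and interior solutions, writes $\hat U$ for the felicity function defined in (13), and uses a Lagrange multiplier $\mu$ for problem (15); the multiplier is notation introduced here for illustration and does not appear in the original argument.

```latex
\begin{align*}
a^c(k, v) &= a\bigl(k, y^c(k, v)\bigr)
  && \text{(duality between (14) and (15))} \\
\frac{\partial a^c}{\partial k}
  &= \frac{\partial a}{\partial k}
   + \frac{\partial a}{\partial y}\,\frac{\partial y^c}{\partial k}
  && \text{(chain rule)} \\
\mathcal{L} &= y - \mu\bigl\{V[a, \hat U(y - a, k)] - v\bigr\},
  \qquad 1 = \mu V_2 \hat U_1
  && \text{(first-order condition for } y\text{)} \\
\frac{\partial y^c}{\partial k}
  &= -\mu V_2 \hat U_2 = -\frac{\hat U_2}{\hat U_1}
  && \text{(envelope theorem)} \\
\frac{\partial a}{\partial k}
  &= \frac{\hat U_2}{\hat U_1}\,\frac{\partial a}{\partial y}
   + \frac{\partial a^c}{\partial k}
  && \text{(rearranging)}
\end{align*}
```

The ratio $\hat U_2/\hat U_1$ converts a change in knowledge into the equivalent change in income, so the first term plays the role of the income effect and the second is the compensated (substitution) effect.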


IV. Conclusions 

Several specific features of the technology and preferences are
important for obtaining the results in theorem 1. First, it is important that
learning display spillovers among goods. Otherwise, learning simply
reinforces existing patterns of production, which works against both
the introduction of new goods and the discontinuation of old ones.
Krugman (1985) has explored such a technology, with a fixed,
bounded set of goods, in the context of international trade. The
conclusion there is that once an international pattern of specialization is
established, it persists. Because each country learns only about the
goods it has produced itself, the initial pattern of comparative
advantage is simply exacerbated as production occurs. Similar conclusions
can be expected in a closed economy.

Second, it is important that “forward” spillovers be stronger than
“backward” spillovers. This is the basic content of assumption 3,
which is similar in spirit to the restriction made in Wan’s (1975) model
of learning. An assumption of this sort is needed to ensure that new
goods are introduced.

Finally, the characteristics model of preferences provides an
analytically tractable framework for introducing interactions among goods.
Specifically, it allows one to retain the simplicity of additive
separability, without some of its drawbacks. Preferences that are additively
separable over goods are not particularly well suited to obtaining the
type of results in theorem 1. The reason is that they imply a
preference for diversity in the goods consumed, which is then a strong force
against abandoning the production of any good. Income effects and/or
changes in relative costs can offset this force, but joint restrictions
on the technology and preferences are then needed to ensure that the
latter are strong enough to produce the desired conclusions.

An unusual feature of the model above is the absence of physical
capital. This implies, of course, that the model can say nothing about
long-run rates of investment, rates of return on capital, and so forth.
However, physical capital could be incorporated in a variety of ways.
For example, one could add a capital goods sector that produces a
homogeneous output with an unchanging technology. The output of
this sector would be combined with labor, and the resulting “aggregate
physical input” used as a factor of production in both the
consumption goods and capital goods sectors. One would then be able to
study questions about long-run rates of investment and so forth.
However, it seems unlikely that the results in theorems 1 and 2 would be
changed. Thus the omission of physical capital limits the scope of the
model but seems unlikely to change the basic conclusions.

Research and development, also absent here, provides another
source for sustained growth through the introduction of new goods.
However, R & D could, at least in principle, also be incorporated. The
results in theorems 1 and 2 will hold whenever preferences and unit
costs satisfy assumptions 1 and 2, and one or more factors cause the
unit cost function to change over time as described in assumption 3.
The factor affecting unit costs might be R & D or firm-specific
learning by doing instead of or in addition to the economywide learning by
doing described here. However, the imperfectly competitive markets
and dynamic incentive problems that R & D or firm-specific learning
entails will make the model very much harder to analyze.

Notice, too, that in some situations R & D and learning by doing are
hard to distinguish, as in Wan (1975). It is not accurate simply to view
improvements in technology as attributable to R & D if they involve a
cost and to learning by doing if they do not. In a learning-by-doing
model the relevant cost is an opportunity cost. It is therefore a little
less obvious, but certainly no less real. The model above is typical in
this respect. The agents there face a trade-off each period between
current utility and the benefits of future cost reduction. Current
production can serve either purpose or both. From a firm’s point of view,
the opportunity cost of faster learning is lower current profits. Hence
for firms in competitive markets, the choice is quite simple. Since
future cost reduction is a pure public good, while the costs are
completely internal to the firm, the benefits of learning receive no weight
in any firm’s production decisions. 5

Finally, notice that the model above might also be viewed as
representing a sector (food, clothing, transportation, and so forth), with
an entire economy then composed of several such sectors, as in
Clemhout and Wan (1970). Would such a multidimensional extension
display the same qualitative properties? It is difficult to say. The
one-dimensional model here has the property that goods that are close in
terms of consumption are also close in terms of production
requirements. A multidimensional model would make such an assumption
more problematic.
Housing Investment in the United States


Robert Topel and Sherwin Rosen 

University of Chicago


A supply-determined model of housing investment is estimated
from quarterly data over the 1963-83 period. The model is built on
dynamic marginal cost pricing considerations and allows short- and
long-run supply elasticities to differ. These are estimated as 1.0 and
3.0, respectively, but most of the long-run response occurs within 1
year. Rapid adjustment speed and the sizable long-run elasticity of
supply are important factors in understanding the volatility of
housing investment. The data also suggest some anomalies in the
expected present value theory of asset pricing for housing capital.


I. Introduction and Summary 

The housing market is an attractive candidate for studying
investment behavior because housing construction is highly volatile and the
data are among the best available. Market prices of capital are directly
observed in the housing sector, and available price data are adjusted
for quality change. We develop and estimate a supply-determined
model of investment in single-family housing, in which short-run
supply is less elastic than long-run supply.

The next section presents informal evidence suggesting that cyclical
movements in housing construction are driven largely by demand
fluctuations along a rising supply curve of new homes. Factor prices
are positively correlated with the level of new construction, and
construction itself is positively correlated with the relative price of
housing. Rising supply price of investment is the focus of the theoretical
model developed in Section III. This model consists of three
equations: a new housing supply decision built up from dynamic marginal
cost pricing considerations, the flow demand for housing services,
and the expected present value theory of asset pricing.

Fig. 1. Annual time-series data

We present empirical estimates in Section IV using quarterly data
over the 1963-83 period. The long-run supply elasticity of new
housing is 3.0 and the short-run (one-quarter) elasticity is 1.0. Most of the
difference between long-run and short-run supply vanishes within 1
year, implying that resources are highly mobile between the
single-family housing investment sector and other sectors of the economy.
Attempts to estimate demand and intertemporal price arbitrage
conditions are less successful because of data limitations and because no
exogenous variables can explain the sustained rise in real housing
prices between 1974 and 1979, when after-tax implicit rental prices
were negative. We believe that the market excessively discounted
expected capital gains over this period, but ex post instrumental variable
predictors track realized prices too closely to prove it.

II. Cost and Construction Activity 

Figure 1 shows some of the data in annualized form to exhibit the 
cyclical patterns most clearly (quarterly data are used in the empirical 
work). The main series to be explained is housing starts. They took a 
gentle downward course in the 1963-70 period, with a few wiggles 
associated with the 1967 and 1970 recessions. The next episode was a 
boom in 1970-73 followed by a decline of equal magnitude in 1973- 
75. A more substantial boom and bust occurred during the 1975-82 
period, with a peak in activity in 1977-78. The peak-to-trough ratio 
of building activity in the second and third episodes is approximately 
2.0: an expansion doubles the output of new homes and a contraction 
cuts it in half. Quarterly starts data (not shown) exhibit enormous 
seasonal variations. Summertime construction activity is twice as large 
as in winter, so seasonals rival cyclical variations in amplitude. Since 
the industry is highly volatile and large flows of resources move in or 
out within a few quarters, we expect to find a large supply elasticity. 

The relative price of new homes refers to a hedonically adjusted
“house of 1977 characteristics” deflated by the consumer price index
(excluding the shelter component). Visually comparing prices and
starts suggests that price movements and construction activity are
positively correlated, though construction activity appears to turn
down prior to the downturn in prices. Apart from that detail of
timing, this observation suggests a rising supply price of new homes.
The Boeckh index of real residential construction costs lends
additional support for the hypothesis of a rising supply price.
Construction costs as a whole closely match the movements in housing prices
and in construction activity. Examination of the individual cost
components (not shown) reveals similar comovements. The market for
construction labor exhibits high unemployment rates, a high level of
job turnover, and the most volatile employment patterns of any
industry (Topel and Ward 1987). Hourly wage rates and employment
of construction labor closely follow the price and output series. Real
lumber prices and lumber consumption, a major materials
component of house construction, also closely track house prices and new
construction. That wage rates and prices of building materials are
positively correlated with factor utilization and with the price of new
houses is consistent with a rising supply price of factors of production
to the construction sector. In sum, the raw data are consistent with a
rising supply curve of new houses traced out by shifts in demand.

III. Investment and the Housing Market 

Most empirical studies of housing investment have followed the
stock-adjustment demand model of Muth (1960, 1981). Later work by
Kearl (1979) and Poterba (1984) views investment as determined by
supply conditions (Witte 1963). The basic idea of the supply theory is
easily stated. Assume that asset prices clear the stock market so that
existing stock is willingly held by the public. Then desired stock is
identical to actual stock and the “demand for investment” is not
defined. Since investment is a small fraction of existing stock, any new
units appearing on the market can be sold at existing prices. The
number of new units forthcoming at any time then depends on the
level of market prices relative to marginal costs of decentralized
construction firms and developers. The number of new homes produced
is a point on the construction supply curve.

The relation between this decentralized market framework and
adjustment cost theory has been clarified by Mussa (1977). For the
economy as a whole, external adjustment costs amount to rising
supply price; increasing marginal cost is equivalent to external
adjustment costs. The production possibilities curve between the output of
investment goods and that of all other goods is concave because
different industries use different factor proportions. Similarly, Abel
(1980) and Hayashi (1982) have clarified the connections between
Tobin’s (1969) Q theory of investment and adjustment cost theory.
The marginal cost of construction equals the marginal value of
additional stock (its price) in adjustment cost theory, whereas the averages
are proportional to each other in Q theory. In either case investment
is determined by the intersection of an infinitely elastic “demand
curve” with an investment supply curve, so rising supply price is
necessary for investment to be finite in any period. Investment is spread
over an extended interval of time because it is too expensive to do it
all at once.

The simplicity of these models arises because investment decisions
are myopically determined by comparing current asset prices with
current marginal costs of production. Current asset prices are “sufficient
statistics” for investment. However (see Kydland and Prescott 1982),
the sufficiency of current prices for investment rests on the
assumption that short- and long-run investment supply coincide. If short-run
supply is less elastic than long-run supply because it takes some time
to move factors of production between industries, then the current
price is no longer sufficient for investment decisions. Builders must
form expectations of future prices in choosing current construction.
Our model incorporates this Marshallian distinction between
short-run and long-run supply by superimposing an internal adjustment
cost mechanism on the representative construction firm. The
long-run production possibilities curve between investment and other
goods can be thought of as the outer envelope of a family of short-run
curves, with the envelope exhibiting less curvature and greater supply
elasticity than any of the subsets from which it is formed. 1

The general model consists of three relationships: supply, demand,
and expectational linkages between stock and flow prices. For
expository convenience, the discussion in this section is set up in terms of a
continuous-time, nonstochastic model, though the empirical work
uses a discrete-time, stochastic framework.

A. Supply of New Homes 

A complete model of the dynamics of new housing supply requires
detailed specification of supply dynamics for all factors of production
to the industry. We cut through these immense complications and
approximately incorporate dynamic factor supply conditions into
industry supply by allowing marginal cost to vary with both the level of
output and its rate of change; that is, “internal” adjustment costs are
superimposed on the rising long-run supply price for the
representative construction firm. Short-run supply is less elastic than long-run
supply because rapid changes in the level of construction activity are
penalized by higher costs.

Specify the industry cost function as

$$C = C(I, \dot I, y), \tag{1}$$

where C is total cost corresponding to gross investment level $I$, $\dot I$ is
the rate of change of gross investment, and y is a vector of variables that
shift the cost function: the level of factor prices for those factors that
are elastically supplied to the industry and factor supply shifters for
those that are supplied less elastically. Gross housing investment is the


1 A theorem of Benveniste and Scheinkman (1979) implies that the marginal value of
a unit of capital is the gradient of a value function. These marginal values generally
must be estimated from stock and bond market data but are the directly observed
housing prices in this case. Current Q is sufficient to describe value if short- and
long-run supply are identical, but past as well as current values of Q are necessary if
short- and long-run supply differ. Summers's (1981) finding that current and past values of Q
affect manufacturing investment is consistent with the model presented here. Chirinko
(1986) discusses other aspects of these theories.
output of the construction industry, defined in the usual way:

$$I = \dot K + \delta K, \tag{2}$$

assuming exponential depreciation at rate $\delta$. The following properties
are imposed on the cost function (1). First, $C_1 = \partial C/\partial I > 0$ and
$C_{11} = \partial^2 C/\partial I^2 > 0$: marginal cost is positive and increasing in
output. Second, $C_2 = \partial C/\partial \dot I \ge 0$ and
$C_{22} = \partial^2 C/\partial \dot I^2 \ge 0$: there is a nondecreasing
cost penalty for changing the level of output.

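In making its supply decision, the representative firm chooses $I$ and
$\dot I$ to maximize expected discounted value. With P(t) written for the
competitively determined (stock) price of a standard unit of housing
at time t, the firm maximizes

$$\int_0^\infty \bigl[P(t) I(t) - C(I(t), \dot I(t), y(t))\bigr] e^{-rt}\, dt, \tag{3}$$

where r is the rate of interest. The Euler condition is

$$P(t) = \frac{\partial C}{\partial I} + r\,\frac{\partial C}{\partial \dot I} - \frac{d}{dt}\left(\frac{\partial C}{\partial \dot I}\right). \tag{4}$$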

Equation (4) nests the myopic supply case within a more general
framework of differences between long- and short-run supplies. For
if $\partial C/\partial \dot I = 0$, construction activity is determined by equating
marginal cost to market price for all t, and current price alone is sufficient for
supply. However, equation (4) shows that costs of changing output
impose a wedge between price and marginal cost. The wedge causes
long-run supply to be more elastic than short-run supply.

To illustrate these points, linearize the cost terms in equation (4).
With operator notation DZ = dZ/dt and so forth, equation (4)
becomes

$$(1 + r\beta D - \beta D^2) I(t) = \left(\frac{\beta}{c_{22}}\right) P(t) - \left(\frac{\beta}{c_{22}}\right)\bigl[c_1 + r c_2 + c_{13}\, y(t)\bigr], \tag{5}$$

where the terms in $c_i$ and $c_{ij}$ are derivatives of the cost function
evaluated at a stationary point, and $\beta = c_{22}/(c_{11} + r c_{21})$. The crucial
parameter $c_{22}$ is the second derivative of costs with respect to $\dot I$, so
equation (5) illustrates a well-known result that adjustment costs must be
increasing to have any consequences. 2
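The step from the Euler condition (4) to the linear form (5) can be sketched as follows. This is an illustrative reconstruction, assuming a first-order expansion of the marginal cost terms around the stationary point, symmetry of cross partials ($c_{12} = c_{21}$), and $c_{23} = 0$; constants of the expansion are treated as absorbed into $c_1$ and $c_2$.

```latex
\begin{align*}
P &= C_1 + r\,C_2 - \frac{d}{dt} C_2
  && \text{(Euler condition (4))} \\
C_1 &\approx c_1 + c_{11} I + c_{12}\dot I + c_{13}\, y, \qquad
C_2 \approx c_2 + c_{21} I + c_{22}\dot I
  && \text{(linearization, } c_{23} = 0\text{)} \\
P &= c_1 + r c_2 + c_{13}\, y + (c_{11} + r c_{21}) I + r c_{22}\dot I - c_{22}\ddot I
  && \text{(using } c_{12} = c_{21}\text{)} \\
(1 + r\beta D - \beta D^2) I
  &= \frac{\beta}{c_{22}}\bigl[P - c_1 - r c_2 - c_{13}\, y\bigr],
  \qquad \beta = \frac{c_{22}}{c_{11} + r c_{21}}
  && \text{(dividing by } c_{11} + r c_{21}\text{)}
\end{align*}
```

Setting $c_{22} = 0$ makes $\beta = 0$ and removes the derivative terms on the left, which is the myopic case in which current price alone determines supply.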

2 The quadratic approximation imposes symmetry in costs for changes in both
directions. If expansions are capacity constrained by availability of skilled labor or by fixed
capital requirements of materials suppliers, then costs are not symmetric with respect to
expansions and contractions. These refinements are not pursued here. We assume for
simplicity that $c_{23} = 0$ (marginal adjustment costs are independent of supply shifters),
though these can be easily reincorporated without affecting the subsequent analysis.
The partial equilibrium path of investment supply is found as
follows: Define $\theta(t)$ as the right-hand side of equation (5), a linear
function of P(t) and y(t). Dividing through by $\beta$ and rearranging results in

$$\left(D^2 - rD - \frac{1}{\beta}\right) I(t) = (D - \lambda_1)(D - \lambda_2) I(t) = -\frac{\theta(t)}{\beta}, \tag{6}$$

where $\lambda_1$ and $\lambda_2$ are the roots of the characteristic equation
$\lambda^2 - r\lambda - (1/\beta) = 0$. Both roots are real; one is negative ($\lambda_1$), and
one is positive ($\lambda_2$). The solution (this is a partial equilibrium solution
because P(t) is endogenous in the full market equilibrium) to equation (6)
takes the unstable positive root forward and the stable (negative) root
backward (Sargent 1979). Performing these operations results in

$$I(t) = \left[I_0 - \frac{1}{\beta(\lambda_2 - \lambda_1)} \int_0^\infty e^{-\lambda_2 s}\,\theta(s)\, ds\right] e^{\lambda_1 t} + \frac{1}{\beta(\lambda_2 - \lambda_1)} \left[\int_0^t e^{\lambda_1 (t - s)}\,\theta(s)\, ds + \int_t^\infty e^{-\lambda_2 (s - t)}\,\theta(s)\, ds\right], \tag{7}$$

where $I_0 = I(0)$ is the initial condition for the problem. Equation (7)
describes the distributed lag and lead responses of investment to the
forcing function $\theta(t)$. The exponential weights on $\theta(t)$ in the last two
integrals are declining in both directions so that forcing data affect
current investment I(t) through a backward and forward exponential
“window.” The weighting functions are concentrated on current data
$\theta(t)$, and supply decisions become myopic as $\beta$ approaches zero
because $\lim_{\beta \to 0} |\lambda_i| = \infty$. This occurs when $c_{22}$ approaches zero, from the
definition of $\beta$.
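The behavior of the two roots can be checked numerically. The sketch below solves the characteristic equation λ² − rλ − 1/β = 0 with the quadratic formula; the interest rate and the grid of β values are hypothetical and chosen only to show the roots growing in absolute value as β shrinks.

```python
import math


def characteristic_roots(r, beta):
    """Roots of lambda**2 - r*lambda - 1/beta = 0, returned as (negative, positive)."""
    disc = math.sqrt(r * r + 4.0 / beta)  # always real for beta > 0
    return (r - disc) / 2.0, (r + disc) / 2.0


r = 0.05  # hypothetical interest rate
roots_by_beta = {}
for beta in (1.0, 0.1, 0.001):  # hypothetical adjustment-cost parameters
    lam1, lam2 = characteristic_roots(r, beta)
    roots_by_beta[beta] = (lam1, lam2)
    print(f"beta={beta:<6} lambda1={lam1:9.4f} lambda2={lam2:9.4f}")
```

The roots sum to r and multiply to −1/β, so they always have opposite signs, and both become large in absolute value as β approaches zero, which is the sense in which supply decisions become myopic.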

A partial equilibrium conceptual experiment illustrates the
distinction between short- and long-run supply. Start from a situation in
which $\theta(t)$ has been constant at a value $\theta_1$ and investment has settled
down to its long-run level $I(t) = I_1 = \theta_1$, a point on the long-run
supply curve. Take this as an initial condition in (7) and suppose that
the price of housing takes an unexpected jump to $P_2$, where it remains
thereafter. Then $\theta(t)$ jumps from $\theta_1$ to some higher value $\theta_2$ and $I(t)$
converges asymptotically to a new long-run value of $I_2 = \theta_2$, also on
the long-run supply curve. Substituting $\theta(t) = \theta_2$ into (7) yields the
path by which $I(t)$ travels from $I_1$ to $I_2$:

$$I(t) = \theta_2 - (\theta_2 - \theta_1) e^{\lambda_1 t}. \tag{8}$$

The exponential response is largest at the beginning and smallest at
the end. Manipulations of equation (8) lead to the flexible accelerator:
$\dot I(t) = -\lambda_1 [I_2 - I(t)]$, where $I_2$ is the target to which $I(t)$ converges
when $P(t) = P_2$. This form is familiar from early discussions of
adjustment cost models (Eisner and Strotz 1963; Lucas 1967; Gould 1968;
Treadway 1969). This experiment depicts an evolving supply curve if
various short runs are identified with specific intervals of time and the
long run with an arbitrarily long interval. The long-run supply curve
connects the points $(I_1, P_1)$ and $(I_2, P_2)$ in the investment-price plane.
Short-run supply curves are spun out of the point $(I_1, P_1)$ and are less
elastic than long-run supply, with the elasticity increasing as time goes
by.
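The adjustment path after a permanent price jump, and its flexible-accelerator form, can be traced with a short sketch. The values of θ₁, θ₂, and the stable root λ₁ below are hypothetical; the code simply evaluates I(t) = θ₂ − (θ₂ − θ₁)e^{λ₁t} and compares a finite-difference slope with −λ₁[I₂ − I(t)].

```python
import math


def investment_path(t, theta1, theta2, lam1):
    """I(t) after a permanent jump in theta from theta1 to theta2 (lam1 < 0)."""
    return theta2 - (theta2 - theta1) * math.exp(lam1 * t)


def accelerator_rate(t, theta1, theta2, lam1):
    """Flexible accelerator: dI/dt = -lam1 * (I2 - I(t)), with target I2 = theta2."""
    return -lam1 * (theta2 - investment_path(t, theta1, theta2, lam1))


theta1, theta2, lam1 = 1.0, 1.5, -0.8  # hypothetical values
for t in (0.0, 1.0, 4.0, 12.0):
    level = investment_path(t, theta1, theta2, lam1)
    rate = accelerator_rate(t, theta1, theta2, lam1)
    print(f"t={t:5.1f}  I(t)={level:.4f}  dI/dt={rate:.4f}")
```

The rate of change is largest at t = 0 and decays toward zero as I(t) approaches the new long-run level, which is why the measured supply elasticity rises with the length of the run.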

Equation (7) also shows how supply responds to temporary changes
in price. Suppose that price unexpectedly rises from $P_1$ to $P_2$ for a
finite interval of time T, after which it returns to $P_1$. The price
disturbance is “more permanent” the larger is T and is “more transitory”
the smaller is T. Differentiating (7) with respect to t and evaluating at
t = 0 yields an expression for the initial (impact) response:

$$\dot I(0) = \lambda_1 \theta_1 + \frac{1}{\beta} \int_0^\infty e^{-\lambda_2 s}\,\theta(s)\, ds. \tag{9}$$

For the postulated square wave pulse in P(t) this becomes

$$\dot I(0) = -\lambda_1 (\theta_2 - \theta_1)\bigl(1 - e^{-\lambda_2 T}\bigr), \tag{10}$$

which is increasing in T, that is, the more permanent the pulse. How
long must the pulse in P(t) last for the impact response to be m
percent of the impact response to a permanent change in price? From
equation (10) the pulse must have length $T^* = -\ln(1 - m)/\lambda_2$; $T^*$ is
decreasing in $\lambda_2$ or increasing in $c_{22}$, another way of saying that
differences between short- and long-run responses to price changes vanish
as internal adjustment costs get small.
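The pulse-length calculation can be made concrete with a small sketch. It evaluates the impact response to a square-wave pulse of length T and the length T* at which that response reaches a share m of the permanent-change response, with m expressed as a fraction. All parameter values are hypothetical.

```python
import math


def impact_response(T, theta1, theta2, lam1, lam2):
    """Impact response dI/dt at t=0 to a pulse of length T (lam1 < 0 < lam2)."""
    return -lam1 * (theta2 - theta1) * (1.0 - math.exp(-lam2 * T))


def pulse_length_for_share(m, lam2):
    """Pulse length T* at which the impact response is a share m of the permanent one."""
    return -math.log(1.0 - m) / lam2


theta1, theta2 = 1.0, 1.5   # hypothetical forcing levels before and during the pulse
lam1, lam2 = -0.8, 0.85     # hypothetical stable and unstable roots
permanent = -lam1 * (theta2 - theta1)  # limit of the impact response as T grows

for m in (0.5, 0.9, 0.99):
    t_star = pulse_length_for_share(m, lam2)
    share = impact_response(t_star, theta1, theta2, lam1, lam2) / permanent
    print(f"m={m:.2f}  T*={t_star:.3f}  realized share={share:.3f}")
```

A larger positive root shortens T*, so with small internal adjustment costs even a brief price pulse draws nearly the full permanent-change response.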

Specification (4) or (5) has more than academic interest. We were
led to it because the simpler model does not allow short-run/long-run
differences in supply. It also fits the data better. The cost of this
generality, as is clear from (7), is that current P(t) no longer
incorporates all current and future information that is relevant for
investment decisions. Expectations of future asset prices affect current
supply.

B. Demand, Expectations, and Market Equilibrium 

The supply function is the main focus of this study but is only one
element of a structural model of the overall market. To understand
how all these elements interact in determining market dynamics, it is
helpful to outline the larger model.

Consider the simplest possible demand specification in which
frictions generated by heterogeneity of units and the matching of buyers
and sellers are ignored. In particular, assume that housing units can
be measured on a homogeneous scale through use of a hedonic index,
that market transactions can be treated as if they occur in a frictionless
auction, and that the capital market is perfect. Let K(t) denote the stock
of housing capital and assume a proportional service flow. Then the
inverse demand for housing services can be written as

$$R = aK + x, \tag{11}$$

where R is the implicit rental price of a unit of housing services, x(t) is
a vector of exogenous demand shifters, and a < 0.

Connections between stock prices and expected future rental prices
complete the model. The rational expectations hypothesis (perfect
foresight in this deterministic model) is used here. When taxes are
ignored to simplify, the rental price of a house is its amortized stock
price including allowances for interest, depreciation, and capital
gains, or

$$R = (r + \delta) P - \dot P, \tag{12}$$

where r is the rate of interest. The value of the housing stock must be
bounded so that x(t) and y(t) cannot grow too fast and the discounted
future price of capital converges:

$$\lim_{t \to \infty} P(t)\, e^{-(r + \delta) t} = 0. \tag{13}$$

Integrating (12) and using boundary condition (13) yields the familiar
asset pricing equation:

$$P(t) = \int_t^\infty R(s)\, e^{-(r + \delta)(s - t)}\, ds. \tag{14}$$

The price of a house is its discounted future market equilibrium
rental. 3
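The asset pricing relation can be checked numerically in the simplest case. The sketch below approximates the present-value integral with the trapezoid rule over a long finite horizon (an approximation introduced here, not part of the original analysis) and compares it with the closed form R/(r + δ) that applies when the rental is constant. The rent, interest rate, and depreciation rate are hypothetical.

```python
import math


def house_price(rent, r, delta, horizon=400.0, steps=100_000):
    """Approximate P(0) = integral of rent(s) * exp(-(r + delta) * s) ds on [0, horizon]."""
    h = horizon / steps
    disc = r + delta
    # Trapezoid rule: half weight on the two endpoints, full weight inside.
    total = 0.5 * (rent(0.0) + rent(horizon) * math.exp(-disc * horizon))
    for i in range(1, steps):
        s = i * h
        total += rent(s) * math.exp(-disc * s)
    return total * h


r, delta = 0.04, 0.03                      # hypothetical interest and depreciation rates
price = house_price(lambda s: 7.0, r, delta)  # hypothetical constant rental flow
print(f"numerical price: {price:.4f}  closed form: {7.0 / (r + delta):.4f}")
```

With a constant price the capital gains term vanishes, so the rental equals (r + δ)P, and the computed price of about 100 matches the closed form.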

When (12) is substituted into (11) and (5) is rewritten in the obvious
notation, the complete market dynamics of stocks and prices are
described by two linear differential equations:

$$(1 + r\beta_1 D - \beta_1 D^2) I(t) = \beta_0 + \beta_2 P(t) + y(t), \tag{15}$$

$$(r + \delta) P(t) - \dot P(t) = aK(t) + x(t), \tag{16}$$

along with the connection between I(t) and K(t) in equation (2), initial
conditions for K(0) and I(0), and terminal condition (13).
Differentiating (16) with respect to t and substituting from (2) yields

$$(1 + rBD - BD^2) P(t) = aB\, I(t) + B(D + \delta)\, x(t), \tag{17}$$

where $B = [\delta(\delta + r)]^{-1}$.
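The substitution that produces (17) can be written out step by step. This sketch uses only the rental relation (16) and the accumulation identity (2), with D denoting the time derivative operator.

```latex
\begin{align*}
(r + \delta)\dot P - \ddot P &= a\dot K + \dot x
  && \text{(differentiate (16))} \\
\dot K &= I - \delta K, \qquad aK = (r + \delta)P - \dot P - x
  && \text{(from (2) and (16))} \\
(r + \delta)\dot P - \ddot P
  &= aI - \delta\bigl[(r + \delta)P - \dot P - x\bigr] + \dot x
  && \text{(substitute for } \dot K \text{ and } aK\text{)} \\
\delta(r + \delta)P + r\dot P - \ddot P &= aI + (D + \delta)x
  && \text{(collect terms)} \\
(1 + rBD - BD^2)P &= aB\,I + B(D + \delta)x,
  \qquad B = [\delta(\delta + r)]^{-1}
  && \text{(divide by } \delta(\delta + r)\text{)}
\end{align*}
```

Eliminating K in this way leaves a second-order equation in P driven by I, which pairs with the second-order supply equation in I driven by P.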

3 With the method of Lucas (1981), it is readily shown that this model is the
decentralized market equivalent of a social planning problem that maximizes discounted
consumer and producer surplus. Rationality in the sense of (14) is necessary for
efficiency, that is, for market prices to reflect the true social value of additional capital.
Analysis of this system reveals the partial equilibrium nature of the
discussion surrounding equations (5)-(10) above. In the special
myopic case in which $\beta_1 = 0$, (15) and (16) are the familiar second-order
system analyzed by Sheffrin (1983) and Poterba (1984). Dynamics are
easily analyzed using phase-plane methods (Abel 1982; Drazen 1985;
Judd 1985). It pays to build ahead of anticipated demand when there
is rising supply price in order to distribute costs over an extended
interval of time. For instance, an anticipated transitory increase in
future demand causes bubble-like price and investment responses:
House prices increase immediately in a rational market, and this
signals increased construction activities prior to the time the change
occurs. Rental prices fall during this phase because of accumulating
stock. At the point at which demand actually jumps up, rational
agents anticipate its transitory nature, so price starts falling and
construction turns around. After the shock has passed, the housing stock
is too large and must be worked down to its steady-state level. Further
reductions in price reduce investment below steady-state values, while
price and investment gradually rise back to steady-state levels. Rising
supply price spreads investment and price responses both backward
and forward from the time anticipated shocks occur.

The generalized model in which $\beta_1 \neq 0$ is (15) and (17). This is a
fourth-order system in P(t) and I(t) and cannot be analyzed in the 
phase plane. Nonetheless, its solution is qualitatively similar to the 
simpler model. Now the incentives to spread adjustments over an 
extended interval of time are even larger because of the extra penalty 
of internal adjustment costs. Responses are more sluggish than when 
$\beta_1 = 0$ for this reason. The characteristic equation for system (15) and
(17) can have complex roots, however, something that cannot happen
when $\beta_1 = 0$. This leads to damped sinusoidal distributed lagged
responses of price and investment to pulses in x(t) and y(t) and occurs 
when demand for housing services is very inelastic. Space limitations 
preclude analyzing full system dynamics here. 


IV. Estimation 

A. Supply 

The supply function is estimated with quarterly time-series data on 
U.S. housing starts over the 1963:I-1983:IV period. The empirical
form of the myopic ($\beta_1 = 0$) supply model (4) is

$$I_t = \beta_0 + \beta_2 P_t + \beta_3 y_t + v_t, \tag{18}$$

where $I_t$ denotes new single-family housing units started during
quarter t, $P_t$ is the (real) hedonic price index for 1977-quality homes,
and $y_t$ is a vector of cost shifters. Unobserved cost shifters account
for $v_t$, and
these are assumed to be orthogonal to observable supply and demand
shifters. Summary statistics for variables entering (18) are reported in 
the last row of table 1. Data sources and definitions of variables are 
found in the Appendix. 

Several alternative specifications are shown in table 1. The first four 
rows ignore any autoregressive structure in the residuals, and the last 
two assume an AR(2) process. The estimation method is instrumental 
variables using current and lagged exogenous variables as instruments
because of the endogeneity of $P_t$. In practice, the first-stage
instrumenting equation has a large $R^2$. This "overfitting" means that
the point estimates differ little from least squares. Still, endogeneity is
unlikely to be a serious problem because investment is such a small 
fraction of existing stock. Seasonally unadjusted data are used, in¬ 
cluding seasonal dummies in the regression (not shown), plus another 
dummy for the severe winter of 1979. The real price index includes 
the value of the site plus structure, though similar results were ob¬ 
tained using the structure price alone. 
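
The logic of instrumenting price can be illustrated with simulated data. The sketch below is not the paper's estimator or data: it uses a single hypothetical instrument (an observed demand shifter), a single endogenous regressor, and made-up coefficients, and it compares the simple instrumental variable slope with least squares when price is correlated with an unobserved cost shock.

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n

def ols_slope(y, x):
    """Least-squares slope of y on x (with intercept)."""
    return covariance(x, y) / covariance(x, x)

def iv_slope(y, x, z):
    """Instrumental variable slope of y on x using instrument z."""
    return covariance(z, y) / covariance(z, x)

rng = random.Random(0)          # fixed seed so the run is reproducible
true_slope = 1.5                # hypothetical supply response to price
demand_shift, price, starts = [], [], []
for _ in range(20_000):
    z = rng.gauss(0.0, 1.0)     # observed demand shifter (the instrument)
    v = rng.gauss(0.0, 1.0)     # unobserved cost shock in the supply equation
    # A favorable supply shock lowers price, so price is endogenous.
    p = z - 0.8 * v + rng.gauss(0.0, 1.0)
    demand_shift.append(z)
    price.append(p)
    starts.append(2.0 + true_slope * p + v)

slope_ols = ols_slope(starts, price)
slope_iv = iv_slope(starts, price, demand_shift)
print(slope_ols, slope_iv)
```

In this artificial setting least squares is biased toward zero while the instrumented slope recovers the assumed response. When the first stage fits price almost perfectly, as the text reports for the actual data, the two estimates would be expected to lie close together.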

The first column of table 1 indicates positive supply responses to 
changes in the price of housing. The implied supply elasticity at sam¬ 
ple means ranges between 1.4 and 2.2 and is not sensitive to the 
specification of the error process. 4 

Our initial specifications of supply included real interest rates as 
cost shifters, meant to reflect the cost of working capital to builders. 
The magnitude of the effect of interest rates on new investment sug¬ 
gests that more is involved, however. We find a strong response of 
housing starts to changes in both the real rate of interest and expected 
inflation, and the hypothesis that nominal rates of interest affect 
housing investment cannot be rejected. The reported specifications 
include both the ex ante real rate of interest and the expected 3- 
month rate of price inflation. Both have similar negative effects on 
construction. The estimates in row 5 imply that a one-point increase 
in either the annual real rate of interest or the expected rate of infla¬ 
tion reduces new construction by about 8.0 percent. These effects are 
too large to be generated by changes in the cost of capital to builders. 
When the model includes both current and lagged effects of these 


4 There may be selection problems in the price data because they are constructed 
from actual transactions. For example, if there are no sales in a particular location in 
some quarter, that location gets no weight in the price index. This is not a serious 
problem for estimating an aggregate supply function because it approximates the ap¬ 
propriate "marginal” concept. However, it may affect more detailed inferences con¬ 
cerning timing and lags. Also, approximately 25 percent of units are built on contract 
and the rest for the market at large. This introduces some noise, for our purposes, in 
linking starts with prices on a quarter-to-quarter basis. Experiments with one-quarter 
leads and lags of prices and with two-quarter price averages revealed that the estimates 
of supply parameters are insensitive to these refinements. 
variables, both have similar statistically significant negative effects on
current supply. 

Sensitivity of housing construction to interest rates is well known 
(e.g., Muth 1981) but is surprising because all demand-side effects
should be embodied in asset prices in an ideal market. There are 
reasons why nominal interest rate changes affect housing demand; 
for example, higher nominal rates increase current real interest pay¬ 
ments on fixed-rate mortgage loans (Kearl 1979). Another possibility 
for demand-side effects is that credit was rationed during the sample 
period (Poterba 1984). Effects of either kind should reduce invest¬ 
ment by causing prices to fall. They should have no independent 
direct effects on supply. Perhaps the lag structure of model (18) is too 
simple, and changes in current interest rates signal changes in future 
asset prices. Alternatively, fluctuations in the nominal rate may signal 
changes in the ability to sell new homes at the current price. 

This last interpretation is supported by the finding that time to sale 
has a large effect on new construction. The Months variable in table 1 
is the median time on the market for new houses for sale in quarter t. 
Sales delay entails forgone interest costs to the builder and can be 
incorporated by discounting the price to reflect expected waiting time 
to sale (Poterba 1984). However, table 1 shows that delay effects are 
much too large to be interpreted as forgone interest costs alone. 5 The 
incremental cost of a 1-month increase in time to sale surely is less 
than 1 percent of the price because it is just 1 month’s interest. A 
supply price elasticity of 2.0 implies an effect of less than 2 percent, 
yet the direct estimates show that an additional month’s delay reduces 
investment by 30 percent. Similarly, the typical house is on the market 
for 2 months prior to sale, so a one-point increase in the real rate 
increases a builder’s cost by 0.2 percent, yet the directly estimated 
effect is 8 percent. 6 These findings suggest that a pure auction model 
of trade in homogeneous units does not completely describe the hous¬ 
ing market, even for aggregate time-series analysis. 

We have experimented with including the Boeckh index of con¬ 
struction input costs, the manufacturing wage, and the average wage 
of construction workers as cost shifters. None had important effects. 
For example, row 4 reports estimates that control for the hourly wage 
of construction workers. After instrumenting to account for rising 

5 We have considered the case in which Months is endogenous. When this variable is
instrumented, the results differ trivially from those reported here.

6 These effects are much larger than Poterba’s (1984) constrained estimates, though 
the specification in table 1 is otherwise similar to his, except for the trend term. Drop¬ 
ping Trend from the investment equation increases the supply elasticity by 20 percent; 
it is included to allow for technical change in the industry and because of the marked 
trend in prices apparent in fig. 1. 
supply price of labor to the industry, we find no evidence that wage
fluctuations were exogenous cost shifters. Rather, they are endoge¬ 
nously determined from shifts in the derived demand for construc¬ 
tion labor. Finally, the last rows report variants of the myopic supply 
model when the errors follow an AR(2) process. The main results are 
not affected. However, the statistical significance of the autoregres¬ 
sive structure suggests misspecification of supply dynamics. We turn 
to the dynamically enriched model next. 

The discrete-time stochastic analogue of the Euler condition (5) or
(15) is

$$I_t = \beta_0 + \beta_1 I_{t-1} + \alpha\beta_1 E_t I_{t+1} + \beta_2 P_t + \beta_3 y_t + v_t, \tag{19}$$

where $E_t$ denotes expectation given period t information, $\alpha$ is a
discount factor, and $\beta_1 > 0$ reflects internal adjustment costs ($c_{22}$ above).
If $\beta_1$ is significantly positive, then long-run supply is more elastic than
short-run supply. Of course, $\beta_2 > 0$ and $\beta_3 < 0$. The appearance of
$I_{t-1}$ and $E_t I_{t+1}$ in (19) adds econometric complications. We continue to
assume that the error term represents unobserved cost shifters. The
expectation is unobserved, and, as before, $P_t$ is endogenous. To
estimate (19), replace $E_t I_{t+1}$ with its realization $I_{t+1}$:

$$I_t = \beta_0 + \beta_1 I_{t-1} + \alpha\beta_1 I_{t+1} + \beta_2 P_t + \beta_3 y_t + v_t - \alpha\beta_1\varepsilon_{t+1}, \tag{20}$$

where $\varepsilon_{t+1} = I_{t+1} - E_t I_{t+1}$ is orthogonal to period t information
under rational expectations; $I_{t+1}$ is endogenous and correlated with
the composite error term. We assume that $E(x_{t-j}v_t) = E(y_{t-j}v_t) = 0$ at
all lags j, so that lagged supply and demand shifters are valid
instruments for $I_{t+1}$ and $P_t$. Two sets of estimates are reported in table 2,
depending on the $v_t$ process.

First, if $v_t$ follows an arbitrary time-series process, then lagged
endogenous variables are also correlated with the error, so consistent
parameter estimates are obtained by using current and lagged values
of exogenous variables as instruments. With the composite errors in
(20) denoted by $\eta_t = v_t - \alpha\beta_1\varepsilon_{t+1}$, the error covariance at lag 1 is

$$E(\eta_t\eta_{t-1}) = E(v_t v_{t-1}) - \alpha\beta_1 E(v_t\varepsilon_t). \tag{21}$$

Innovations in $v_t$ are components of the forecast error $\varepsilon_t$, so (21) is
nonzero (Hansen 1982). Since $E(v_t\varepsilon_t)$ is positive, $\eta_t$ is negatively
autocorrelated at lag 1, even if $v_t$ is white noise, if $\beta_1 > 0$. If $v_t$ is serially
correlated as well, the negative correlation in $\eta_t$ persists at higher lags.
If $v_t$ is AR(1) with parameter $\mu$, then

$$E(\eta_t\eta_{t-j}) = \mu^{j-1}E(\eta_t\eta_{t-1}). \tag{22}$$

In calculations of standard errors for the instrumental variable
estimates of (20), the errors are allowed to follow (22), where a consistent
estimate of $\mu$ in (22) is used to form the error covariance matrix in
(20).
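
The lag-1 covariance in (21) follows from expanding the product of adjacent composite errors. The sketch below uses only the definition $\eta_t = v_t - \alpha\beta_1\varepsilon_{t+1}$ and the rational-expectations property that $\varepsilon_{t+1}$ is orthogonal to everything dated t or earlier, so the last two cross terms vanish.

```latex
\begin{align*}
E(\eta_t\eta_{t-1})
  &= E\left[(v_t - \alpha\beta_1\varepsilon_{t+1})
            (v_{t-1} - \alpha\beta_1\varepsilon_t)\right] \\
  &= E(v_t v_{t-1}) - \alpha\beta_1 E(v_t\varepsilon_t)
     - \alpha\beta_1 E(\varepsilon_{t+1}v_{t-1})
     + (\alpha\beta_1)^2 E(\varepsilon_{t+1}\varepsilon_t) \\
  &= E(v_t v_{t-1}) - \alpha\beta_1 E(v_t\varepsilon_t).
\end{align*}
```

When $v_t$ is white noise the first term is zero, which leaves $-\alpha\beta_1 E(v_t\varepsilon_t) < 0$ whenever $\beta_1 > 0$ and $E(v_t\varepsilon_t) > 0$.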

Second, if $v_t$ truly is AR(1), it is appropriate to quasi-difference (20)
(see Cumby, Huizinga, and Obstfeld 1983). The model becomes

$$I_t = (1 + \mu\alpha\beta_1)^{-1}[\beta_0(1 - \mu) + (\mu + \beta_1)I_{t-1} - \mu\beta_1 I_{t-2} + \alpha\beta_1 I_{t+1} + \beta_2(P_t - \mu P_{t-1}) + \beta_3(y_t - \mu y_{t-1}) + \eta_t - \mu\eta_{t-1}], \tag{23}$$

$$\eta_t - \mu\eta_{t-1} = u_t - \alpha\beta_1\varepsilon_{t+1} + \mu\alpha\beta_1\varepsilon_t, \tag{24}$$

where $u_t$ is white noise. Equation (23) can be estimated by instrumental
variables and imposing the nonlinear restrictions across parameters. 7
We report estimates of the parameters of (20) in both differenced
(eq. [23]) and nondifferenced form. Estimates based on
higher autoregressive processes did not differ from those reported
here.

Table 2 reports the estimates. Rows 1-4 are based on equation (20),
using only lagged supply and demand shifters as instruments for
investment and price. In all specifications, the error covariance at lag
1 is negative, as implied by (21) if the covariance between $v$ and $\varepsilon$ is
large and $\beta_1 > 0$. This does not mean that the "true" errors, $v_t$, are
negatively serially correlated: the estimated autoregressive parameter
for $v_t$ is always positive, a plausible result if $v_t$ represents unobserved
cost shifters. The quasi-differenced form in rows 5 and 6 produces a
slightly smaller autoregressive parameter, though still positive.

The main result in table 2 is that the time-invariant rising supply 
price model of table 1 is rejected: the estimated internal adjustment 
cost parameter is numerically large and always more than triple its 
estimated standard error. The estimates of $\beta_1$ are found in the second
column and were obtained by constraining the coefficients of $I_{t-1}$ and
$E_t I_{t+1}$ to differ by an assumed discount factor of $\alpha = .98$. This
restriction is not rejected in any form of the model. Estimates of $\beta_1$ and
other parameters are insensitive to choice of $\alpha$ in the neighborhood of
.98, and when the restriction is not imposed, the point estimates for
independent coefficients on $I_{t-1}$ and $E_t I_{t+1}$ are nearly identical to the
reported values of $\beta_1$. Estimated adjustment costs are slightly smaller
in the quasi-differenced form in rows 5 and 6, but the fundamental 
finding is not affected: There are differences in the response of cur¬ 
rent investment to permanent and transitory changes in price, as well 
as differences between short- and long-run production adjustments. 

7 The appearance of the forecast error $\varepsilon_t$ in (24) means that current demand and
supply shifters are not exogenous. These variables are in the information set at t, and
they are components of $\varepsilon_t$. Thus $y_t$ also must be instrumented. On the other hand, lags
of investment and price are valid instruments under this assumption, so some trade-off
in efficiency is involved.
The estimated adjustment cost effects are somewhat sensitive to
specification: the model in row 1 implies large adjustment costs (and 
small current investment response to current price changes), but in¬ 
cluding lagged interest rates and median time on the market substan¬ 
tially increases the immediate response of investment to price. As in 
table 1, both median time to sale and interest rates have strong nega¬ 
tive effects on current investment, and the hypothesis that decisions 
are driven by the nominal rate of interest still cannot be rejected in 
any form of the model. 8 This suggests that their significance is not 
due to dynamic misspecification but rather to a conceptual inade¬ 
quacy of treating the housing market as if it were a homogeneous 
auction market. If we exclude row 1, the long-run effects of price on 
investment are not much different among the models in table 2. 

To quantify the experiments analogous to (8)—(10) above, consider 
the one-sided forward solution to (20): 


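$$I_t = \frac{\kappa}{\alpha}I_{t-1} + \frac{\kappa\beta_0}{\alpha\beta_1(1 - \kappa)} + \frac{\kappa}{\alpha\beta_1}\sum_{i=0}^{\infty}\kappa^i E_t(\beta_2 P_{t+i} + \beta_3 y_{t+i}), \tag{25}$$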


where $\kappa$ is determined (from the characteristic polynomial of [20]) by

$$\kappa = \frac{1}{2\beta_1}\left(1 - \sqrt{1 - 4\alpha\beta_1^2}\right). \tag{26}$$
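
A short numerical check of (26), using the adjustment cost estimate $\beta_1 = 0.496$ and the assumed discount factor $\alpha = .98$ reported in the text, reproduces the value of about 0.82 quoted for the first model. The function name is illustrative only.

```python
import math

def stable_root(beta1, alpha):
    """Evaluate kappa from equation (26)."""
    discriminant = 1.0 - 4.0 * alpha * beta1 ** 2
    if discriminant < 0.0:
        raise ValueError("roots are complex for these parameter values")
    return (1.0 - math.sqrt(discriminant)) / (2.0 * beta1)

beta1, alpha = 0.496, 0.98
kappa = stable_root(beta1, alpha)

# kappa / alpha should solve alpha*beta1*z**2 - z + beta1 = 0, the
# characteristic polynomial implied by the coefficients of (20).
z = kappa / alpha
residual = alpha * beta1 * z ** 2 - z + beta1
print(round(kappa, 2), residual)
```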

From (25), the current impact of an unanticipated unit pulse in price 
that is thereafter expected to last exactly T periods is 


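$$\left.\frac{\partial I_t}{\partial P_t}\right|_T = \frac{\beta_2\kappa}{\alpha\beta_1}\sum_{i=0}^{T}\kappa^i = \frac{\beta_2\kappa}{\alpha\beta_1(1 - \kappa)}\left(1 - \kappa^{T+1}\right) \tag{27}$$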


and is increasing in T. Note that even for T = 1, the response exceeds 
the price coefficient in table 2 because the experiment in (27) allows 
future levels of planned investment to adjust optimally. As in Section 
III, the response path of investment to a permanent change in P 
allows comparison between short- and long-run supply response. 
Straightforward calculations give the time path as 


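$$\frac{\partial I_{t+j}}{\partial P} = \frac{\beta_2}{\alpha\beta_1}\,\frac{\kappa}{1 - \kappa}\,\frac{1}{1 - (\kappa/\alpha)}\left[1 - \left(\frac{\kappa}{\alpha}\right)^{j+1}\right], \tag{28}$$

which is increasing in j. The change in long-run equilibrium levels of
investment is obtained by letting $j \to \infty$.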
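
The responses in (27) and (28) can be evaluated directly. The sketch below codes the two formulas as written, in raw derivative units rather than the elasticities reported in the tables (converting would require sample means that are not given here). It uses $\beta_1 = 0.496$, $\beta_2 = 805.76$, and $\alpha = .98$ from the text, and the function names are illustrative. As a consistency check, the limit of (28) is compared with the steady-state multiplier $\beta_2/(1 - \beta_1 - \alpha\beta_1)$ implied by setting investment constant in (19).

```python
import math

def kappa_from(beta1, alpha):
    """Stable root from equation (26)."""
    return (1.0 - math.sqrt(1.0 - 4.0 * alpha * beta1 ** 2)) / (2.0 * beta1)

def transitory_impact(beta1, beta2, alpha, T):
    """Current response to a price pulse, following equation (27)."""
    k = kappa_from(beta1, alpha)
    return beta2 * k / (alpha * beta1 * (1.0 - k)) * (1.0 - k ** (T + 1))

def permanent_path(beta1, beta2, alpha, j):
    """Response j periods after a permanent price change, equation (28)."""
    k = kappa_from(beta1, alpha)
    lam = k / alpha
    scale = beta2 * k / (alpha * beta1 * (1.0 - k))
    return scale * (1.0 - lam ** (j + 1)) / (1.0 - lam)

beta1, beta2, alpha = 0.496, 805.76, 0.98
impacts = [transitory_impact(beta1, beta2, alpha, T) for T in (1, 4, 8)]
path = [permanent_path(beta1, beta2, alpha, j) for j in range(200)]
steady_state = beta2 / (1.0 - beta1 - alpha * beta1)
print(impacts, path[0], path[-1], steady_state)
```

Both sequences rise monotonically, and the permanent-change path converges to the steady-state multiplier, which is why long-run supply is so much more elastic than short-run supply when $\beta_1$ is close to one-half.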


8 The shifter for the winter of 1979 has a larger coefficient in the adjustment cost
model of table 2 than in table 1. We included this variable on examining the residuals
for models in table 2. Excluding it has no appreciable effect on the other estimates.
Table 3

Estimated Supply Elasticities for Permanent and Transitory Price Changes
(Evaluated at Sample Means)

                A. Current Response to a Price     B. Response by Quarter to a
                   Shock Lasting T Quarters           Permanent Price Increase
Model           T = 1   T = 4   T = 8   T = ∞      τ = 1   τ = 4   τ = 8   τ = ∞
1. κ = .82       .72    2.18    3.15    3.94       3.94   12.27   18.22   23.83
2. κ = .34      1.04    1.64    1.68    1.68       1.68    2.69    2.76    2.76
3. κ = .22      1.18    1.51    1.51    1.51       1.52    1.93    1.93    1.93

Note.— κ computed from eq. (27).


Table 3 reports supply responses for the models in rows 1, 2, and 6 
of table 2. For ease of interpretation, the effects are expressed as 
elasticities evaluated at sample means; for example, the first entry of 
0.72 corresponds to equation (27) with T = 1, $\beta_1 = 0.496$, and
$\beta_2 = 805.76$. For these parameters the impact of adjustment costs on
supply decisions is large ($\kappa = 0.82$), so adjustments are spread over a long
period of time (see panel B). The current response to a permanent
price increase has an elasticity of 4.0, and the long-run supply
elasticity is nearly 24.0. These estimates are very large because no allowance
is made for time to sale effects and $\beta_1$ is overestimated in that
specification. Rows 2 and 6 produce smaller adjustment cost effects.
In these models, a permanent price increase has a 50 percent greater 
impact on current investment than a one-period price shock does. 
However, almost all this difference is accounted for by a relatively 
short disturbance lasting 1 year. In our judgment the best estimate of 
the long-run elasticity of supply is model 2, which yields an elasticity 
of 2.76. For comparison, the rising supply price model in table 1 
yielded an elasticity of 2.08 for both long- and short-run price 
changes. 

B. Demand

It has not been possible to estimate meaningful demand parameters 
from these data. There are two reasons for this. One is a limitation of 
data, and the other is an anomaly in imputed rents during the 1974- 
79 period. 

Fig. 2.—After-tax real rent, 1977 dollars

Estimating equation (16) requires constructing a time series on
stocks $K_t$ using perpetual inventory methods. But investment is such a
small fraction of existing homes that the imputed stock series is too
smooth and trendlike to be informative about demand. The
quasi-difference form in equation (17) uses directly observed investment,
but prices appear in second-difference form, and this compounds
measurement errors and timing problems in the price data. The
second-differenced price series is so noisy that it too is uninformative
about demand parameters. The data simply do not allow us to
estimate this component of the model.
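
The smoothness of an imputed stock series is easy to see in a small sketch of the perpetual inventory recursion. The timing convention $K_t = (1 - \delta)K_{t-1} + I_t$, the starting stock, and the investment figures below are all hypothetical; only the quarterly depreciation rate of 0.0035 is taken from the Data Appendix.

```python
def perpetual_inventory(initial_stock, investment, delta=0.0035):
    """Build a stock series from gross investment flows.

    Assumed convention: K_t = (1 - delta) * K_{t-1} + I_t.
    """
    stocks = [initial_stock]
    for flow in investment:
        stocks.append((1.0 - delta) * stocks[-1] + flow)
    return stocks

# Hypothetical numbers: starts swing by 30 percent around 250 units
# per quarter against an existing stock of 50,000 units.
starts = [250.0, 175.0, 325.0, 250.0]
stock = perpetual_inventory(50_000.0, starts)

# Quarter-to-quarter growth of the stock stays within a narrow band
# even though investment itself is highly variable.
growth = [(b - a) / a for a, b in zip(stock, stock[1:])]
print(stock, growth)
```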

Figure 2 graphs a time series of the after-tax rental price index $R_t$
imputed from real stock prices (see the Appendix for details). Visually
filtering quarter-to-quarter noise, it exhibits disturbingly low
values during the 1974-79 period, when relative housing prices
increased dramatically, and rises rapidly thereafter. Feldstein (1982)
has pointed out that general price inflation during this period in¬ 
creased the income tax subsidy to home ownership, making housing a 
more attractive investment than other assets and causing its relative 
price to rise. But figure 2 suggests that taxes were only part of the 
story. To the first order, greater subsidies should be capitalized in 
house prices but leave real rents unchanged for given demand for 
housing services. To the second order, higher asset prices encourage 
greater investment, increase the stock of homes, and decrease rents a 
little. But after-tax rents in figure 2 declined far too much during 
1974-79 to be attributed to the minor increase in stocks. The marked 
increase in implicit rents during 1979-82 suggests that capital gains 
expectations were too pessimistic in the 1974-79 period. 

The values in figure 2 are ex post rentals. Ex ante rentals would 
exhibit less of a decline if the market systematically underpredicted 
capital gains over that period. The negative ex post real interest rates 
observed during the inflation support this possibility, but we are un¬ 
able to construct an ex ante series on capital gains that differs from 
the ex post realizations in any meaningful way. Retroactively, it is too
easy to make one-step-ahead forecasts of prices within the sample
period, a difficulty associated with the “overfitting” of instrumental 
variables noted above. Rosen, Rosen, and Holtz-Eakin (1984) suggest 
that uncertainty increased during the period when housing prices 
increased. Attaching an additional risk premium to the real rate of 
interest in the rental imputation would indeed temper the decline in 
rents shown in figure 2. However, there is no persuasive evidence that 
real mortgage interest rates rose or that mortgage credit was exces¬ 
sively rationed over the period in question, so the decline in rents 
remains an unresolved question. 

V. Conclusion 

The main empirical findings support the view that investment re¬ 
sponds elastically to changes in asset prices. The estimated long-run 
supply elasticity of about 3.0 is the largest that has been found so far 
in quarterly time-series data (see the surveys by Olsen [1986] and 
Weicher [1979]). The estimated short-run supply elasticity of 1.0 is 
much smaller than the long-run elasticity, but the differences between 
the two converge within the time frame of 1 year. There are good 
economic reasons for rapid convergence in the construction industry. 
Labor and other resources used in house construction are not highly 
specialized to the industry and are widely used in all sectors of the 
economy. Perhaps the pronounced seasonal and cyclical fluctuations 
in construction promote a certain adaptability and built-in flexibility 
in the organization of the industry that allow resource movements to 
respond quickly to changing economic conditions. 

The estimates also reveal deficiencies of an investment model based 
on homogeneous capital and costless auction market assumptions.
The evidence that nominal interest rates and expected waiting time to 
sale have large direct effects on housing investment is not consistent 
with these assumptions. Better understanding of the timing of trans¬ 
actions and of market participation is necessary to fill out knowledge 
of dynamics. Fragmentary data on transactions volume in the overall 
housing market appear to be positively correlated with housing starts. 
Externalities of matching and search imply that it is more advanta¬ 
geous to participate in an active market than in an inactive one. The 
implied intertemporal substitution may provide the link between asset 
pricing, new construction, and transactions volume now missing from 
conventional capital theory. It remains to be studied in detail. None¬ 
theless, the large price elasticity of supply of new houses estimated 
here must be an important consideration for understanding the great 
variability in housing investment. 
Data Appendix

Time-series data used in the empirical work were obtained from the following 
sources. 

New single-family housing prices. —The price data were obtained from a sur¬ 
vey conducted by the Bureau of the Census since 1963 for new single-family 
homes actually sold during the reference period. The index refers to charac¬ 
teristics of a standard 1977-quality house as obtained from a hedonic regres¬ 
sion of actual price data on a vector of house characteristics in each year. 
Source: U.S. Bureau of the Census, New One-Family Houses Sold and for Sale 
(Construction Reports, ser. c25). 

Investment. —Housing starts are new one-unit structures on which construc¬ 
tion was started during the reference period. Similar results were obtained 
from real dollar values of gross investment and are not reported. Source: U.S. 
Bureau of the Census, Construction Reports, series c20. 

Interest rates. —The nominal rate of interest for the supply function is the 3- 
month Treasury bill rate quoted by Salomon Brothers on the last day of the 
previous period. The real rate used is the one-step-ahead forecast from an 
estimated AR(2) regression in the first differences of the real rate (Fama and 
Gibbons 1982). Since $r_t$ is estimated, standard errors are corrected (Murphy
and Topel 1985). Mortgage interest rates for first mortgage loans on
single-family homes are published by the Federal Home Loan Bank Board. The
series used to construct $R_t$ refers to the effective interest rate on 25-year
maturity loans with a loan to price ratio of 25 percent. 

Months. —Median months on the market for new units sold during the 
quarter. Source: Unpublished data obtained from the Bureau of the Census. 

Boeckh cost index. —A weighted average of construction input prices for 
small residential structures. Source: U.S. Department of Commerce, Bureau 
of Industrial Economics, Construction Review. 

Personal consumption expenditures. —Source: U.S. Bureau of Economic Anal¬ 
ysis, The National Income and Product Accounts of the United States. 

Families. —The number of married-couple family households. Source: U.S. 
Bureau of the Census, Current Population Reports, series p-20. 

Fuel price index. —Source: U.S. Department of Labor, Bureau of Labor 
Statistics, Monthly Labor Review. 

Real implicit rental price. —Define the income-tax-adjusted real interest rate
as $\tilde{r}_t = (1 - \tau_t)i_t - \pi_t$, where $i_t$ is the nominal interest rate, $\tau_t$ is the marginal
income tax rate, and $\pi_t$ is the rate of inflation. Anticipated real rent is the
expected present value of a round-trip buy and sell transaction over one
quarter (ignoring transactions costs), or $R_t = P_t - E_t P_{t+1}(1 - \delta)/(1 + \tilde{r}_t)$,
where $P_t$ is the real asset price and $\delta$ is the quarterly depreciation rate,
calculated at 0.0035 per quarter from a perpetual inventory method. This
expression ignores maintenance expenditures and property taxes and assumes
no taxation of capital gains (see Hendershott and Hu [1981] and Dougherty
and Van Order [1982] for a discussion of those refinements). The ex post
numbers shown in figure 2 replace the expectation with realized values, using
the 3-month Treasury bill interest rate and assuming that capital gains are
taxed at rate $\tau_t$. Assuming no taxation of capital gains yields a series with the
same general appearance but with more pronounced fluctuations and a much
larger drop in rent during 1974-79. Two alternative estimates of $\tau_t$ were
tried. One is Barro and Sahasakul's (1983) estimates of the average marginal
tax rate; the other is the estimated tax bracket that makes tax-free municipal
bonds a marginally profitable investment. In figure 2, $\tau_t$ is set at 0.3. The
time-series character of the $R_t$ series is insensitive to these differences in taxes. The
quarter-to-quarter noise in figure 2 arises from measurement error in price
differences in the computational formula.
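
The rental imputation described above can be sketched in a few lines. The code implements only the baseline formula with no taxation of capital gains, replaces the expected price with the realized next-quarter price, and treats all rates as quarterly; the function names and the sample inputs are hypothetical, and the capital gains tax adjustment used for figure 2 is not reproduced.

```python
def tax_adjusted_real_rate(nominal_rate, tax_rate, inflation):
    """Income-tax-adjusted real rate: (1 - tau) * i - pi."""
    return (1.0 - tax_rate) * nominal_rate - inflation

def ex_post_rents(prices, real_rates, delta=0.0035):
    """Ex post rent R_t = P_t - P_{t+1} * (1 - delta) / (1 + r_t).

    prices has one more element than the rents returned, because each
    rent uses the realized price one quarter ahead.
    """
    return [
        p_now - p_next * (1.0 - delta) / (1.0 + rate)
        for p_now, p_next, rate in zip(prices, prices[1:], real_rates)
    ]

# Hypothetical quarterly inputs, for illustration only.
rate = tax_adjusted_real_rate(nominal_rate=0.02, tax_rate=0.3, inflation=0.004)
flat_prices = [100.0, 100.0, 100.0]
rents = ex_post_rents(flat_prices, [rate, rate])

# With a constant price the formula reduces to P * (r + delta) / (1 + r).
benchmark = 100.0 * (rate + 0.0035) / (1.0 + rate)
print(rate, rents, benchmark)
```

Because each rent depends on the difference between two adjacent prices, small measurement errors in the price index translate into large quarter-to-quarter swings in the imputed series.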
This paper analyzes a monopolist that markets successive generations
of new and improving nondurable products. Prices, research
intensity, and product innovations are derived as sequential equilib¬ 
rium outcomes to a dynamic game with incomplete information. 
Asymmetric information is an important feature of the model. The 
monopolist is fully aware of the current product’s quality, as are 
consumers who have tried it. However, the beliefs of other people 
are characterized by a probability distribution that depends on the 
monopolist's marketing strategy and the product’s popularity. The 
analysis illustrates a new context in which price signaling might serve 
as a mechanism for ensuring that only high-quality products are 
marketed. More important, it shows how product life cycles are gen¬ 
erated in the absence of signaling and how a reputation for produc¬ 
ing high-quality goods becomes established in such cases. 


I. Introduction 

When a new product is introduced to the market, the producer typi¬ 
cally knows more about its attributes than most consumers. To max¬ 
imize discounted profits, the firm should widely publicize those attri¬ 
butes that consumers find most appealing but be less frank about the 
product’s undesirable ones, provided warranties are too costly to en¬ 
force and there are no reputational effects that might reduce its sales 
of other goods. Even so, uninformed people presumably account for 
such bias when deciding whether to purchase the product or not. 
Later on in the product's market life, these issues become much less
important, for over time consumers learn all about the salient charac¬ 
teristics through their personal experiences as well as from indicators 
of the product’s overall popularity. 

This paper investigates the diffusion process described above, viewing
it as the equilibrium outcome from a dynamic game with incomplete
information. Section II lays out a model in which a monopolist
markets a succession of new and improving nondurable products. 
The event sequence that occurs during a typical period of the game 
proceeds as follows. Suppose that a given product was sold last pe¬ 
riod. The monopolist must now decide whether to withdraw it and 
undertake more research in order to discover a new product or, if 
not, what price should be charged for the retained product. Once the 
price of a marketed product has been set, consumers (some of whom 
may be more informed than others) decide whether to buy it. At the 
time each new product is introduced, all consumers are less informed 
than the monopolist about its characteristics, but they can learn about 
them by buying the product or, alternatively, by later making infer¬ 
ences from those who have. 

The equilibrium in this model yields the rate at which research 
is undertaken, the birth and death times of successive products, 
the diffusion of information about each product’s characteristics 
throughout the population, and the monopolist’s pricing policy. Sec¬ 
tion III establishes the existence of a sequential equilibrium satisfying 
two additional refinements and then characterizes the equilibrium 
outcomes of such equilibria. 

Essentially one of three scenarios applies, thus determining 
whether and when a lower-quality product will be introduced and 
withdrawn from the market, how long it takes for the reputation of a 
higher-quality product to be established, and what the price policy is. 
A reason for withdrawing a lower-quality product from the market is 
that its customer base would vanish if people who had not tried out 
the product learned from the experience of those who had. In this 
setup uninformed buyers eventually deduce quality from retro¬ 
spectively observing the aggregate quantities sold. However, similar 
results would emerge if neighbors could inspect each other’s retail 
purchases. Suppose momentarily that such public information is the 
only deterrent to marketing lower-quality products, the first scenario. 
Then all products are introduced irrespective of quality, and the price 
of each subsequently declines over time. The decline reflects the di¬ 
minishing value to consumers of acquiring private information about 
product quality as the date approaches when it will be made publicly 
available. At that point a lower-quality product would be invariably 
replaced by a new product, whereas a higher-quality product would 
be publicly revealed as such. 
Even if the reason given above is unimportant, lower-quality products
are not necessarily marketed indefinitely, as the second scenario
shows. Granted, research is a costly activity, so there may be a positive
probability of marketing defective products when this is private infor¬ 
mation to the supplier. Furthermore, the benefits of immediately 
undertaking more research are temporarily reduced when any prod¬ 
uct is introduced because the findings of such research might render 
the current product obsolete. However, as the total number of people 
who have tried out a defective product increases, the uninformed 
proportion of the population declines, shrinking demand. Conse¬ 
quently research is increased. In turn, more research activity raises 
the probability of superseding the current product. Hence the price 
of any surviving product increases because the longer a lower-quality 
product has been on the market, the more likely it would have been 
withdrawn previously. In fact, there comes a time when a lower- 
quality product is withdrawn for sure because even if it was priced as a 
higher-quality product and the remaining uninformed people (mis¬ 
takenly) had no doubt that it was a higher-quality product, sales reve¬ 
nue would still not compensate the net benefits from introducing a 
new product to the whole population. 

The two scenarios mentioned above illuminate the important role 
of consumer uncertainty about quality in generating intertemporal 
competition between product generations supplied by the same firm. 
In some circumstances this can induce dynamic signaling. Suppose 
that the monopolist would introduce a higher-quality product at a 
very low introductory price, anticipating high demand for it in subse¬ 
quent periods. Then consumers might refuse to buy any product 
introduced at a higher price. If it is unprofitable for the monopolist to 
introduce a lower-quality product at the same price because future 
demand for it is less, this (third) scenario could represent equilibrium 
behavior. 

Previous published works in several different areas are related to 
the analysis undertaken here. There is a distinction, commonly made 
in the marketing literature, between innovative consumers, who ex¬ 
hibit greater willingness to experiment with products of lower ex¬ 
pected quality and pay a premium price for being first, and more 
cautious, imitating consumers, who buy the product only if its price 
falls or after its reputation has become established (see, e.g., the
characterization of adopters given by Rogers [1983, pp. 241-70]). These
differences in consumer behavior have been attributed to tastes and 
motivate economic models of intertemporal price discrimination such 
as Stokey's (1979). However, this analysis shows that such behavior 
may be observed in a homogeneous population as well. Before the 
product’s characteristics are well known, the price an uninformed 
person is willing to pay accounts for future opportunities he might 
have to exploit the private information gained through consumption.
Because this information is rendered valueless when the characteristics
become publicly revealed, the earlier it is acquired the more valuable
it is. Thus the reservation price of a product whose characteristics
come from a given probability distribution falls over time, or if the 
price rises, the probability that a product will be unsatisfactory must 
fall. 

This paper also contributes to the theory of consumer behavior 
toward experience goods, as formulated by Nelson (1970), Grossman, 
Kihlstrom, and Mirman (1977), Hey and McKenna (1981), Wilde 
(1981), and others. First, it demonstrates how optimization problems 
such as theirs may arise from an equilibrium setting. Here one finds 
that nondurables produced by the same supplier at different times 
compete with each other to some extent because consumers are not 
fully informed. Also, the probability distribution describing a prod¬ 
uct’s characteristics is itself endogenized through the monopolist’s 
research and marketing strategies. This second remark also suggests 
the possibility that, in equilibrium, the benefits of information are 
offset not by higher prices but by products of lower expected quality. 

Third, the equilibrium concept invoked is essentially a further 
refinement of those developed by Selten (1975) and Kreps and Wilson
(1982b). Similar in spirit to Cho and Kreps (1987) and Banks and
Sobel (1987), but most closely related to Milgrom and Roberts (1986), 
the analysis focuses on those sequential equilibria that the monopolist 
would prefer to play should its attempts to innovate prove successful. 

Associated with the concept of equilibrium in environments with 
incomplete information is the process of acquiring a reputation. In 
this respect the analysis undertaken here is more similar to Kreps and 
Wilson’s (1982a) game-theoretic treatment of the chain store paradox 
than Shapiro’s (1982) decision-theoretic approach to reputation 
building. For as in the former, but in contrast to the latter, the beliefs 
of uninformed consumers are modeled as probability distributions 
that are updated using Bayes’s rule as new information (itself endoge¬ 
nously determined in equilibrium) arrives. 

Finally, how to maintain a reputation is the subject of articles 
by Dybvig and Spatt (1980), Klein and Leffler (1981), and Shapiro
(1983). These authors extend Friedman’s (1971) development of trig¬ 
ger strategy equilibria for supergames to situations in which there is 
asymmetric information about product quality. As mentioned earlier, 
high quality is assured in the third scenario. However, the enforce¬ 
ment mechanism analyzed here does not involve consumers collec¬ 
tively punishing the deviating monopolist that introduces a low- 
quality product by, say, paying less for all future new products 
(although this kind of sequential equilibrium also exists in some regions
of the parameter space). Rather, high quality is guaranteed by
low introductory offers in a manner analogous to Spence’s (1973) 
pioneering study of signaling. Reputations are thus identified with 
particular products rather than the firm itself, a feature that seems 
more appropriate the larger the touted innovation and the less 
diversified the product lines. 

II. The Model 

An Overview 

The game is very simple. The product’s characteristics space is sum¬ 
marized by one variable, quality, which can take only two values. 
Moreover, consumers do not demand low-quality products, or 
“fakes,” at a positive price. High-quality products are called “cures.”
Apart from price, many other factors influence the demand for a 
product with a given set of characteristics; in this paper they are 
modeled as a Bernoulli random variable, which is independently and 
identically distributed across the population and over time. “Sickness”
and “health” represent the two possible outcomes. 

Play proceeds as follows. Each period a proportion of the popula¬ 
tion catch a disease that lasts one period. There is a sole supplier of 
drugs: after paying an initial fixed cost for its research laboratories, 
this monopolist faces a fixed probability of discovering a cure at date 
0. If found, the cure could be marketed in the first period and forever 
after. But even if it is unsuccessful, the firm may choose to sell a fake 
to consumers for one or more periods before withdrawing it from the 
market and conducting another experiment. Alternatively, the firm 
might not enter with a fake in the first period, opting instead for 
another experiment in the hope of introducing a cure in period 2. 
The profitability of deceptive practice is attributable to the existence 
of asymmetric information. Although consumers know when an ex¬ 
periment is conducted, they are not automatically privy to the out¬ 
come. Hence sick consumers must decide whether or not to buy a 
prescription on the basis of incomplete information. Those who do 
buy are immediately cured if and only if the drug works, in the 
process acquiring full information about its quality. 

Preferences 

How individuals react to the introduction of new drugs depends, 
among other things, on the nature of product demand, how reliable 
their personal experience is in evaluating product quality, whether 
they can infer anything from their friends’ experiences, what market 
aggregates are published (either officially or as advertisements), and 
what the decision rules of suppliers are. In this framework, a
continuum of consumers, distributed uniformly on the [0, 1] interval,
have identical preferences, caring about the goods $x$ they consume
and also their health $z$. Each person's tastes may be represented by a
time-additive utility function $\sum_{t=0}^{\infty} \beta^t U(x_t, z_t)$,
where $U(x, z)$ is concave and increasing in both arguments and
$\beta \in (0, 1)$ is the discount rate. There are only two states of
health, $z \in \{0, 1\}$, and $\alpha$ is the probability that $z = 0$
(falling sick); for convenience draws are distributed independently
across periods and people.

It is helpful to define $u(p, \theta)$, the expected utility a sick person
attains in a period from buying a prescription at price $p$ that works
with probability $\theta$ (given income $y$, which bounds their expenditure
each period and is henceforth suppressed).

Definition 1. $u(p, \theta) \equiv \theta U(y - p, 1) + (1 - \theta) U(y - p, 0)$.

Notice that $u(0, 1)$ is the utility of a healthy person, which without
loss of generality may be normalized to zero, and $u(0, 0)$ is the utility
of a sick person who does not take medication. Substitution and
differentiation show $u_1(p, \theta) < 0$, $u_2(p, \theta) > 0$,
$u_{11}(p, \theta) < 0$, and $u_{22}(p, \theta) = 0$ (where the subscript
$i \in \{1, 2\}$ indicates partial differentiation with respect to the
$i$th argument).

In all the equilibria analyzed here, if he had the choice, an informed
sick person would buy a cure provided its price is less than or equal
to $\bar p$, defined below.

Definition 2. $u(\bar p, 1) = u(0, 0)$.

To avoid cluttering the exposition, this behavior is imposed as part
of the environment. Accordingly, let $\delta(p_t)$ denote a sick person's
demand for a known cure. From definition 2, if $p_t \le \bar p$, then
$\delta(p_t) = 1$, whereas if $p_t > \bar p$, then $\delta(p_t) = 0$. Also
let $c_t$ represent the quality of the drug most recently introduced by
the monopolist:

$$ c_t = \begin{cases} 1 & \text{if the firm produces cures in period } t \\ 0 & \text{if it produces fakes in period } t. \end{cases} \tag{1} $$

Then demand for the drug by an informed sick person at time $t$ is
simply $c_t \delta(p_t)$.
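
As a purely illustrative sketch of definitions 1 and 2, the block below assumes a specific functional form that is not in the paper: log utility in goods plus a fixed penalty `S` when sick, with income `Y`. Under that assumption the utility of a healthy non-buyer is zero, and the reservation price for a known cure has a closed form. The names `U`, `u`, `reservation_price_known_cure`, and `delta` are hypothetical helpers.

```python
import math

Y = 10.0   # income per period (assumed value)
S = 1.0    # utility loss from being sick (assumed value)

def U(x, z):
    """Assumed period utility: log in goods x, penalty S when sick (z = 0)."""
    return math.log(x) - math.log(Y) - S * (1 - z)

def u(p, theta):
    """Definition 1: expected utility of a sick buyer at price p when the
    drug works with probability theta."""
    return theta * U(Y - p, 1) + (1 - theta) * U(Y - p, 0)

def reservation_price_known_cure():
    """Definition 2: the price p_bar solving u(p_bar, 1) = u(0, 0)."""
    return Y * (1 - math.exp(-S))

def delta(p):
    """Demand of an informed sick person for a known cure (1 = buy)."""
    return 1 if p <= reservation_price_known_cure() else 0
```

With this form, `u(0, 1)` is zero by construction, matching the normalization in the text, and `delta` is the step function described after definition 2.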


Information and Technology 

There are constant returns to scale. Irrespective of quality, drugs cost
$w$ each to produce. Incomplete information arises in the model because
only the monopolist undertaking research directly observes the
outcomes of the experiments it conducts. The introduction of a new
brand in the $t$th period is denoted by

$$ b_t = \begin{cases} 1 & \text{if a new brand is developed in period } t \\ 0 & \text{otherwise.} \end{cases} \tag{2} $$

The probability of conducting a successful experiment is $\theta_0$.

Within the class of equilibria considered here, if it had the choice,
the monopolist would never withdraw a cure from the market. So to
reduce the notational burden, the analysis simply imposes this behavioral
restriction at the outset. Taken together, these remarks imply
$(b_0, c_0) = (1, 0)$ and

$$ \Pr\{c_{t+1} = 1 \mid c_t, b_t\} = c_t + \theta_0 b_t (1 - c_t). \tag{3} $$

As mentioned above, uninformed sick people can become informed
about a particular brand by trying it out when they are sick.
Let $\lambda_t$ be the number of people (or, equivalently, the population
proportion) who are informed at time $t$; also let $q_t$ denote the number
of people who buy the drug then. If a new brand is introduced in
period $t$, the number of informed people drops to zero (i.e., $b_t = 1$
implies $\lambda_t = 0$). Alternatively, when the current brand is retained
(i.e., when $b_t = 0$), the proportion of people who are informed
increases by the number who buy the drug, $q_t$, less those who are repeat
purchasers, $\alpha \lambda_t c_t \delta(p_t)$. (Notice that $\alpha \lambda_t$
is the number of informed sick people, while $c_t \delta(p_t)$ indicates
whether they purchase the drug or not.) Therefore, the law of motion
for $\lambda_t$ is

$$ \lambda_{t+1} = (1 - b_{t+1})[\lambda_t + q_t - \alpha \lambda_t c_t \delta(p_t)]. \tag{4} $$
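
The law of motion in equation (4) can be iterated directly. The sketch below is only an illustration: it assumes a retained cure priced at or below the informed reservation price, so that every sick person buys and the quantity sold equals the sickness probability each period. The function names and parameter values are hypothetical.

```python
def next_informed(lam, q, c, delta, b_next, alpha):
    """Equation (4): next period's informed proportion.

    lam    -- current informed proportion
    q      -- quantity sold this period
    c      -- 1 if the current brand is a cure, 0 if it is a fake
    delta  -- 1 if informed sick people buy a known cure at this price
    b_next -- 1 if a new brand replaces the current one next period
    alpha  -- probability of falling sick
    """
    return (1 - b_next) * (lam + q - alpha * lam * c * delta)

def simulate_cure(alpha, periods):
    """Informed proportion over time when a cure is retained and all sick buy."""
    lam, path = 0.0, [0.0]
    for _ in range(periods):
        q = alpha  # informed repeat buyers plus uninformed first-time buyers
        lam = next_informed(lam, q, c=1, delta=1, b_next=0, alpha=alpha)
        path.append(lam)
    return path
```

Under these assumptions the recursion reduces to one minus the probability of never having been sick, so the informed proportion after t periods is 1 - (1 - alpha)^t, and introducing a new brand resets it to zero.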

The public record of a game's history, $h^t \in H^t$, comprises a vector
sequence of new brands introduced $b_t$, prices posted $p_t$, and quantities
traded $q_t$. In symbols, $h_t = (b_t, p_t, q_t)$ and
$h^t = \{h_s\}_{s=0}^{t-1}$. Sick consumers decide whether to purchase
medication (this action being denoted by $\gamma_t = 1$) or not (denoted
$\gamma_t = 0$); they rely on public records of the game history $h^t$
plus current prices $p_t$ to determine their subjective probability
$\theta_t \in [0, 1]$ that the most recently introduced brand works
and to make their choice $\gamma_t \in \{0, 1\}$. Again, to avoid
complications that occur off the equilibrium path, it is assumed that all
sick uninformed people behave the same way. Consequently,
$q_t \in \{0, \alpha\lambda_t, \alpha(1 - \lambda_t), \alpha\}$.

If the monopolist does not conduct an experiment, it announces a
price $p_t$ at which it supplies all customers. This price is a positive
real number that potentially depends on the monopolist's private
information $c_t$ and the game's history $h^t$ to date. Alternatively, the
monopolist temporarily withdraws from the market to conduct another
experiment. Let $\psi_t$, a function of $h^t$, denote the probability that
an experiment will be conducted at time $t$. Then $b_0 = 1$ and

$$ \Pr\{b_t = 1 \mid c_t, h^t\} = (1 - c_t)\psi_t. \tag{5} $$
By preventing the monopolist from marketing a fake while simultaneously
trying to discover a cure, the model crudely captures some
variable costs of conducting research. (For if people observed the firm
undertaking research, they would deduce that it had not discovered
the cure yet.)

To summarize, an assessment, denoted by $A$, fully characterizes the
beliefs and actions of the players in the game. This game is concerned
with the beliefs of the uninformed sick $\theta(h^t, p_t)$, their choice
whether to take medication $\gamma(h^t, p_t)$ or not, the price of drugs
not withdrawn from the market $p(h^t, c_t)$, and the probability of
withdrawing a fake from the market $\psi(h^t)$. Thus the assessment $A$ is
defined, for each $h^t \in H^t$ and $c_t \in \{0, 1\}$, as the four-tuple
$A(h^t, c_t) = (\theta_t, \gamma_t, p_t, \psi_t)$.


Opportunity Costs and Reservation Prices 

The interesting aspects of incomplete information in the model arise 
because the monopolist is unable to commit itself to truthfully disclos¬ 
ing product quality. This creates a tension between its objectives be¬ 
fore a cure is discovered (when it wants to mislead consumers) and 
those afterward (when its aim is to fully reveal the product’s charac¬ 
teristics). The tension is reflected in the value of owning the monop¬ 
oly at different stages in the game. In particular, given an assessment 
$A$, the initial value of the game to the monopolist, denoted $V(A)$ and
abbreviated by $V$, is

$$ V(A) = E_0\Big\{\sum_{t=0}^{\infty} (1 - \tilde b_t)\,\alpha\beta^t(\tilde p_t - w)[\tilde\lambda_t \tilde c_t \delta(\tilde p_t) + (1 - \tilde\lambda_t)\tilde\gamma_t]\Big\}. \tag{6} $$

Here, the tilde on a variable indicates its dependence on $A$. To interpret
(6), observe that the expectation $E_0$ is taken over the sequence of
random variables $\{b_t, c_t\}_{t=0}^{\infty}$, determined by a probability
distribution parameterized by $\theta_0$ and $\psi$. If a new brand is being
developed in period $t$, then $b_t = 1$ and sales are zero. Otherwise the
net return per unit sold discounted back to zero is $\beta^t(p_t - w)$.
Aggregate demand by the informed is $\alpha\lambda_t c_t \delta(p_t)$, while
demand by the uninformed is $\alpha(1 - \lambda_t)\gamma_t$. When a cure is
discovered, the value of the monopoly becomes $V'(A)$, abbreviated by
$V'$, defined as

$$ V'(A) = \sum_{t=0}^{\infty} \alpha\beta^t(\tilde p_t - w)[\tilde\lambda_t \delta(\tilde p_t) + (1 - \tilde\lambda_t)\tilde\gamma_t]. \tag{7} $$

In contrast to (6), equation (7) does not depend on the sequence of
random variables $\{b_t, c_t\}_{t=0}^{\infty}$. Brand turnover stops once a
cure is found.

In view of the last sentence, some time after a cure has been introduced,
uninformed consumers might come to believe that the
product surely works and be willing to pay $\bar p$ for it; this is when
the reputation of a cure becomes established. More formally, for a
complete history of the game $\{h_t\}_{t=0}^{\infty}$ and given an
assessment $A$, let $\tau(h^t)$ be the first time that
$\theta(h^t, \bar p) = 1$. Because $h^t$ is a stochastic process induced
by the monopolist's research and marketing strategies, $\tau$ is
random.

Definition 3. $\tau(h^t) \equiv \min\{t : \theta(h^t, \bar p) = 1 \mid h^t\}$.

As explained below, the analysis focuses on assessments with a recursive
form (see condition 1 in Sec. III). In these assessments, an
unsuccessful monopolist incurs an opportunity cost by postponing its
research program in order to market fakes. Given $A$, let $w(\lambda, V)$
denote the price at which an unsuccessful monopolist is indifferent
between selling all the uninformed a fake in the current period and
withdrawing it next period versus withdrawing it from the market
immediately, when $\lambda$ people are informed. The second option is
worth $V$ (the initial value of the game to the monopolist), while
the current value of the first option to the monopolist is
$\alpha(1 - \lambda)[w(\lambda, V) - w] + \beta V$. Notice that
$\alpha(1 - \lambda)[w(\lambda, V) - w]$ is the net revenue from current
sales and $\beta V$ is the value of the second option discounted back one
period. Definition 4 follows from making $w(\lambda, V)$ the subject of
the expression that equates these two quantities.

Definition 4. $w(\lambda, V) \equiv w + [\alpha(1 - \lambda)]^{-1}(1 - \beta)V$.
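
The indifference condition behind definition 4 is easy to check numerically. The following is a minimal sketch with made-up parameter values; `opportunity_cost_price` and `value_of_marketing_fake` are hypothetical names for the two sides of the comparison described above.

```python
def opportunity_cost_price(lam, V, w, alpha, beta):
    """Definition 4: price at which an unsuccessful monopolist is indifferent
    between marketing a fake for one more period and withdrawing it now."""
    return w + (1 - beta) * V / (alpha * (1 - lam))

def value_of_marketing_fake(price, lam, V, w, alpha, beta):
    """Net revenue from selling to the uninformed sick this period, plus the
    value of restarting research one period later."""
    return alpha * (1 - lam) * (price - w) + beta * V

# Illustrative values only (assumptions, not taken from the paper).
w, alpha, beta, V, lam = 1.0, 0.3, 0.9, 2.0, 0.3
price = opportunity_cost_price(lam, V, w, alpha, beta)
gap = value_of_marketing_fake(price, lam, V, w, alpha, beta) - V
```

At the price returned by `opportunity_cost_price`, `gap` is zero up to rounding, and the price rises without bound as `lam` approaches one.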

It will be shown below that if $A$ is an equilibrium assessment, then
$V > 0$; hence $w(\lambda, V) > w$. (This follows from the fact that research
has a positive net value in equilibrium and hence is costly to delay.)
Observe that $w(\lambda, V)$ increases as the customer base for a
fake erodes (i.e., it is increasing in the proportion of the population
who are informed), diverging to infinity as that proportion approaches
one. (To compensate the monopolist for
delaying its research into superior products for a period, higher
prices must offset lower quantities sold to the remaining uninformed.)

As the Introduction mentioned, demand for treatment by the uninformed
depends on two factors, namely, prices (current and future)
and expected quality. First, consider a drug whose reputation will be
established next period. Whether an uninformed sick person buys
this product depends purely on his current utility. For a drug that
works $\theta_0$ proportion of the time, let $\omega$ be the price that
equates the utility of an uninformed sick person from buying it with that
attained from not being treated.

Definition 5. $u(\omega, \theta_0) = u(0, 0)$.

Also, given $A$, let $\phi(\lambda, V)$ be the minimum subjective
probability an uninformed sick person would entertain and still buy
treatment at the opportunity cost of producing fakes when the drug's
reputation would be revealed next period.

Definition 6. $u[w(\lambda, V), \phi(\lambda, V)] = u(0, 0)$.

Because $w(\lambda, V)$ is increasing in $\lambda$ and $V$, so is $\phi(\lambda, V)$. The more
people who have tried the product once, the greater the degree of 
confidence the others place in it. There is an economic rationale for 
this bandwagon effect. The higher is the informed proportion of the 
population, the lower is the customer base of an unsuccessful monop¬ 
olist, so the more likely a fake is withdrawn from the market and, 
hence, the greater is the degree of confidence uninformed consumers 
place in retained products. 

By theorem 2 in the next section, the reputation of a cure takes at
most two periods to establish. So, in addition to $\omega$ and
$\phi(\lambda, V)$, one must consider two-period demands. Denote by $v(p)$
the reservation price of an uninformed sick person with subjective
probability $\theta_0$, who acquires with his purchase the option to repeat
purchase at price $p$ next period.

Definition 7. $u[v(p), \theta_0] = (1 + \alpha\beta\theta_0)u(0, 0) - \alpha\beta\theta_0 u(p, 1)$.

When next period's price equals $\bar p$, the reservation value for a cure,
there are no gains from acquiring private information. Consequently,
an uninformed sick person is prepared to pay for benefits only accruing
in the current period, that is, up to $\omega$ (since the drug works with
probability $\theta_0$). Therefore, $v(\bar p) = \omega$. Lowering the price
next period, or increasing the probability of falling ill and the weight
attached to future utility, raises the value of acquiring private
information. Similarly, the greater the chance of success to the firm, the
higher are expected current benefits from taking treatment and the more
likely the private information will be exploited. Thus $v(p)$ is a
decreasing function, while for each $p \in (0, \bar p)$, the mapping is
increasing in $\theta_0$, $\alpha$, and $\beta$.
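
To make definitions 5 and 7 concrete, the sketch below solves them in closed form under an assumed utility specification that is not in the paper (log utility in goods with a fixed sickness penalty `S` and income `Y`). All parameter values are illustrative, and the names are hypothetical.

```python
import math

Y, S = 10.0, 1.0                      # income and sickness penalty (assumed)
ALPHA, BETA, THETA0 = 0.3, 0.9, 0.5   # illustrative parameters (assumed)

def u(p, theta):
    """Definition 1 under the assumed log utility with penalty S when sick."""
    return math.log((Y - p) / Y) - S * (1 - theta)

P_BAR = Y * (1 - math.exp(-S))           # definition 2: u(P_BAR, 1) = u(0, 0)
OMEGA = Y * (1 - math.exp(-S * THETA0))  # definition 5: u(OMEGA, THETA0) = u(0, 0)

def v(p):
    """Definition 7 solved for the two-period reservation price v(p)."""
    k = ALPHA * BETA * THETA0
    target = (1 + k) * u(0, 0) - k * u(p, 1)
    # Invert u(v, THETA0) = log((Y - v) / Y) - S * (1 - THETA0) = target.
    return Y * (1 - math.exp(target + S * (1 - THETA0)))
```

Under these assumptions `v(P_BAR)` coincides with `OMEGA`, and `v` falls as next period's price rises, in line with the properties stated in the text.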


III. Equilibrium 

Existence of Equilibrium 

Sequential equilibria are assessments in which the players in the game 
are sequentially rational and hold consistent beliefs (see Kreps and 
Wilson [1982b] for a theoretical analysis). This paper focuses on a
subset of sequential equilibria exhibiting two distinctive features. 
First, the equilibrium beliefs and actions of players do not depend on 
events that occurred before the introduction of the current brand. 
Condition 1. If > 0, then A(h‘, c,) is independent of h r . 
Sequential equilibria that do not satisfy condition 1 include trigger 
strategies, which Dybvig and Spatt (1980), Klein and Leffler (1981),
and Shapiro (1983) have studied in related modeling environments. 
While trigger strategy equilibria explain why the trademarks of reputable
firms are conferred on the brands they produce, the question of
how and indeed whether a new brand can acquire a reputation for 
high quality in the absence of support from a reputable parent firm 
still remains unanswered. 

Among the sequential equilibria satisfying condition 1, only the 
most profitable one for a successful monopolist is examined. Loosely 
speaking, the rationale for this further refinement is that the monop¬ 
olist should be able to communicate which equilibrium it expects ev¬ 
eryone to play, through its pricing policy. (After all, the monopolist is 
the first mover each period, and everyone else observes its actions.) If 
so, once a cure is discovered, the monopolist picks the subgame equi¬ 
librium outcome that maximizes its discounted flow of net returns 
calculated at that time. Therefore, in the event that a research experi¬ 
ment is unsuccessful, the monopolist, when introducing a fake, is 
obliged to announce the prices everyone anticipates of a successful 
monopolist, to avoid immediate detection. 

Condition 2. The assessment $A$ is chosen so that the sequence
$\{p_t\}_{t=0}^{\infty}$ maximizes $V'(A)$ subject to the constraint that it
is a sequential equilibrium satisfying condition 1.

Aside from refining the set of Nash equilibria, the parameter space 
is also restricted. Once some fraction of the population becomes fa¬ 
vorably informed, but before its reputation is established, a successful 
monopolist might optimally charge p and sell prescriptions to in¬ 
formed people only rather than to uninformed sick people as well (at 
a lower price). It is straightforward to incorporate such behavior into 
the analysis. However, the main results are not affected by it, and 
extra notation is required. So to make the exposition more manage¬ 
able, assumption 1 is imposed throughout. Under this assumption 
(which essentially bounds $\theta_0$ from below and $\beta$ from above), the
behavior described above never occurs in equilibrium.

Assumption 1. 6 0 > (1 - P)[P(20 + l)(2p - 1)]"‘. 

One way of establishing existence is to display a sequential equilib¬ 
rium satisfying condition 1. Since the results derived below show that 
there are only a finite number of equilibrium outcome paths to con¬ 
sider, a maximum for the problem described in condition 2 exists. 

Theorem 1. Given assumption 1, a sequential equilibrium satisfying 
conditions 1 and 2 exists. 

The Appendix contains all the proofs. 


Equilibrium Outcomes 

Associated with the assessment $A$ are the outcomes it generates. Let
$H^t$ denote the set of partial histories $h^t$ that can arise from playing
out $A$ for the first $t - 1$ periods a brand is marketed, and from now on
suppose that $A$ is any sequential equilibrium assessment that satisfies
conditions 1 and 2. Describing the equilibrium outcomes for this
game amounts to writing down the beliefs and actions of the various
players, that is, $A(h^t, c_t)$, for each equilibrium history to date,
$h^t \in H^t$, and drug type, $c_t \in \{0, 1\}$. The order in which the
three scenarios were introduced is now reversed for expository purposes.

Suppose that the successful monopolist establishes its reputation
after one period by making a low introductory price offer that would
have been unprofitable had it not discovered a cure. Under these
circumstances the monopolist's net revenue in the first period of
marketing a cure is $\alpha(p_1 - w)$, and from then on it nets
$\alpha(\bar p - w)$ per period. Hence the value of the monopoly on
discovery, $V_0'$, is

$$ V_0' = \alpha\beta\Big[(p_1 - w) + \frac{\beta(\bar p - w)}{1 - \beta}\Big]. \tag{8} $$

Since the probability of discovery is $\theta_0$, it follows that the
initial value of the game to the monopolist is
$V_0 = \theta_0 V_0'[1 + (1 - \theta_0)\beta + \ldots]$.
Summing this infinite geometric series and substituting for $V_0'$ from
(8), one obtains

$$ V_0 = \alpha\beta\theta_0\Big[(p_1 - w) + \frac{\beta(\bar p - w)}{1 - \beta}\Big](1 - \beta + \theta_0\beta)^{-1}. \tag{9} $$
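
The step from (8) to (9) is just the sum of a geometric series with ratio (1 - theta_0) * beta. The short sketch below compares the closed form with a truncated sum; the function names and parameter values are illustrative assumptions.

```python
def value_on_discovery(p1, p_bar, w, alpha, beta):
    """Equation (8): value of the monopoly when a cure is discovered."""
    return alpha * beta * ((p1 - w) + beta * (p_bar - w) / (1 - beta))

def initial_value(p1, p_bar, w, alpha, beta, theta0):
    """Equation (9): closed form for the initial value of the game."""
    v_disc = value_on_discovery(p1, p_bar, w, alpha, beta)
    return theta0 * v_disc / (1 - beta + theta0 * beta)

def initial_value_by_series(p1, p_bar, w, alpha, beta, theta0, terms=2000):
    """Truncated version of theta0 * V0' * [1 + (1 - theta0) * beta + ...]."""
    v_disc = value_on_discovery(p1, p_bar, w, alpha, beta)
    ratio = (1 - theta0) * beta
    return sum(theta0 * v_disc * ratio ** k for k in range(terms))

# Illustrative values only (assumptions, not taken from the paper).
params = dict(p1=2.0, p_bar=5.0, w=1.0, alpha=0.3, beta=0.9, theta0=0.4)
difference = initial_value(**params) - initial_value_by_series(**params)
```

The two computations agree up to rounding because the ratio of the series is strictly less than one.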

There are two cases to consider. If $w(\alpha, V) \ge \bar p$, then an
unsuccessful monopolist would invariably withdraw its fake after one
period anyway; provided the successful monopolist sets
$p_1 \le w(0, V)$, no fakes are introduced. Thus when the monopolist
discovers a cure, its current value is
$\alpha\beta\{w(0, V) - w + [\beta(\bar p - w)/(1 - \beta)]\}$. This occurs
at date $t$ with probability $\theta_0(1 - \theta_0)^t$. Multiplying the
product of these expressions by $\beta^t$ and summing over $t$, one obtains
the present value of monopoly at date 0. Then, with definition 4 used to
substitute for $w(0, V)$, some straightforward manipulations yield the
initial value of the game to the monopolist, which in this case is

$$ V_1 = \alpha\beta^2\theta_0(\bar p - w)[(1 - \beta)(1 - \beta + \theta_0\beta - \alpha\beta\theta_0 + \alpha\beta^2\theta_0)]^{-1}. \tag{10} $$

The other case occurs when $w(\alpha, V) < \bar p$; this inequality implies
that an unsuccessful monopolist would continue marketing a fake it
introduced in the previous period if the price was $\bar p$. Then entry by
the unsuccessful monopolist into the market is deterred if $V_0$ is not
less than the value of introducing a fake and marketing it for two periods,
namely, $\alpha[(p_0 - w) + (1 - \alpha)\beta(\bar p - w)]$, plus the value
of the game discounted back two periods, $\beta^2 V_0$. (After two periods,
uninformed people deduce that the drug was a fake from aggregate sales data.
This is discussed later in more detail.) Equating the value of this
option with $V_0$, one solves for $p_0$ to obtain

$$ p_0 = w - \frac{\beta(1 + \alpha\beta - \alpha - \beta - \theta_0\alpha\beta - \theta_0\beta^2)(\bar p - w)}{1 - \beta + \theta_0\beta^3}. \tag{11} $$
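
The entry-deterring price can be checked by substituting it back into the indifference condition: with the introductory price set at that level in (9), the value of marketing a fake for two periods and then restarting should equal the initial value of the game. The sketch below does this under illustrative parameter values. The denominator used for the price is the one obtained by solving that condition with (9), which is an assumption about the printed form of (11); the function names are hypothetical.

```python
def initial_value(p1, p_bar, w, alpha, beta, theta0):
    """Equation (9): initial value of the game with introductory price p1."""
    v_disc = alpha * beta * ((p1 - w) + beta * (p_bar - w) / (1 - beta))
    return theta0 * v_disc / (1 - beta + theta0 * beta)

def entry_deterring_price(p_bar, w, alpha, beta, theta0):
    """Introductory price at which marketing a fake for two periods is worth
    exactly the initial value of the game (the condition behind (11))."""
    numerator = beta * (1 + alpha * beta - alpha - beta
                        - theta0 * alpha * beta - theta0 * beta ** 2)
    denominator = 1 - beta + theta0 * beta ** 3
    return w - numerator * (p_bar - w) / denominator

def value_of_two_period_fake(p0, p_bar, w, alpha, beta, V0):
    """Two periods of sales to the uninformed sick, then the game restarts."""
    return alpha * ((p0 - w) + (1 - alpha) * beta * (p_bar - w)) + beta ** 2 * V0

# Illustrative values only (assumptions, not taken from the paper).
p_bar, w, alpha, beta, theta0 = 5.0, 1.0, 0.3, 0.9, 0.4
p0 = entry_deterring_price(p_bar, w, alpha, beta, theta0)
V0 = initial_value(p0, p_bar, w, alpha, beta, theta0)
gap = value_of_two_period_fake(p0, p_bar, w, alpha, beta, V0) - V0
```

For any parameters in the unit interval, `gap` is zero up to rounding, which is the indifference that makes entry with a fake just unprofitable.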

Are there any other price couplets $(p_1, p_2)$ that make marketing
fakes unprofitable? After all, provided $p_1 \le w(0, V)$ and
$p_1 + \beta p_2 \le w(0, V) + \beta(1 - \alpha)w(\alpha, V)$, only cures
are introduced. Condition 2 effectively precludes this by allowing the
successful monopolist to pick the most profitable price couplet that
satisfies these two inequalities. In the second period an unsuccessful
monopolist gains only $\alpha(1 - \alpha)$ in sales revenue for price
increases up to $\bar p$ compared with $\alpha$ by the successful
monopolist, but in the first period their net returns are
affected equally. Therefore, meeting the constraints imposed by its
alter ego (the unsuccessful monopolist) at least cost requires the
successful monopolist to choose $(w(0, V_1), \bar p)$ or $(p_0, \bar p)$
rather than some other price couplet.

Lemma 1. If $\psi_1 = 1$, then $\tau = 2$; if in addition
$w(\alpha, V) \ge \bar p$, then $(p_1, \gamma_1) = (w(0, V_1), 1)$;
alternatively, $w(\alpha, V) < \bar p$ implies $(p_1, \gamma_1) = (p_0, 1)$.
Finally, if $\tau > 2$, then $\psi_1 = 0$.

The second sentence in lemma 1 asserts that if a cure must be
marketed for more than one period to become established, then every
failure preceding the cure was marketed for at least one period. One
price path the lemma rules out involves introducing fakes at $w(0, V)$
with less than unit probability, marketing a proportion of those
introduced for one period only and the remainder for the second period
as well, at $w(\alpha, V)$. For certain structural parameter values that
imply $v(\omega) < w(0, V) < w(\alpha, V) < \bar p$ (the middle inequality
being a direct consequence of definition 4), this outcome is indeed a
sequential equilibrium satisfying condition 1. However, the paragraph above
implies $p_0 + \beta\bar p > w(0, V) + \beta w(\alpha, V)$. Once successful,
the monopolist prefers to signal rather than price the cure at the
opportunity cost of marketing fakes. Consequently, if $p_1 = w(0, V)$ and
$p_2 = w(\alpha, V)$, then $A$ does not satisfy condition 2.

The characterization of the first two scenarios is developed in
stages, drawing heavily on the opportunity cost and reservation price
concepts defined in the previous section. As mentioned before, sick
uninformed people may be willing to pay more than a price $p$ that
equates their current utility from being sick $u(0, 0)$ with their
expected utility from taking the drug $u[p, \theta(h^t, p)]$ because there
is value from acquiring information about its quality. Lemma 2 places an
upper bound on their willingness to pay for this information; the
reservation price of an uninformed person is certainly less than the
reservation price for a known cure. Rather than pay more than $\bar p$ now,
the individual attains a higher utility by waiting until the price falls
below $\bar p$ and buying the drug then (if he is sick again).

Lemma 2. If $p_t > \bar p$, then $\gamma(h^t, p_t) = 0$ for all $h^t \in H^t$.

The requirement of sequential equilibria that beliefs be consistent
yields a close relationship between the rate at which fakes are
withdrawn from the market and the subjective probability uninformed
agents hold about brand quality. Once some people have become
informed, the price of a cure remains above their reservation value
$\bar p$ until the last period of the component game; next period
uninformed people infer from the quantity just sold whether the informed
sick bought the product again. Because no one optimally repeats a
purchase of a low-quality brand (for a positive price), its true quality is
revealed to everyone else by their actions. In this way information
about product quality inevitably seeps out, despite the fact that the
opportunity cost of continuing to market fakes may be relatively low.

Lemma 3. Suppose $\tau(h^t) > t$ for some $h^t \in H^t$. If $\lambda_t > 0$
and $p_t \le \bar p$, then $\tau = t + 1$.

From lemma 3 (which applies to all histories, not just equilibrium 
paths), only uninformed people buy the product until the period 
before its reputation is established. Their demand reflects both the 
current period's expected utility from using a product of unknown 
quality and also the value of private information simultaneously ac¬ 
quired. This information is exploited only if the product has not been 
withdrawn from the market at date t, and the person falls ill then. 

Lemma 2 implies that if the price exceeds $\bar p$, sales will be zero. For
this reason a successful monopolist sells drugs in successive periods at
a price not exceeding $\bar p$ as soon as they are invented. Hence the
reputation of a cure is established within two periods. The reasoning
runs as follows. Some sick people who buy the drug when it is
introduced fall ill again the following period. This group buys the drug
a second time if and only if it cures. Consequently, at the end of the
second period in a drug's life, those people who have been healthy
both periods infer from aggregate sales data whether the informed
sick purchased the drug or not and hence its quality. Therefore, cures
are priced at $\bar p$ from then onward, while fakes are marketed for two
periods at most.

Theorem 2. $\tau \in \{2, 3\}$ and $(\gamma_1, \gamma_2) = (1, 1)$.

The theorem implies that from period 3 onward cures are priced at
$\bar{p}$. The next lemma asserts that fakes are invariably marketed at the
same price as cures. Otherwise uninformed consumers would be able
to infer low quality merely by looking at the price of a fake; also their
introductions and withdrawals are consistent with Bayesian
expectations.

Lemma 4. If $h^t \in H^t$, then $\psi_t = 1 - \theta_{t-1}(1 - \theta_{t-1})^{-1}(1 - \theta_t)\theta_t^{-1}$.
Also $p(h^t, 0) = p(h^t, 1)$.
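
The withdrawal probability in lemma 4 can be read as Bayes' rule applied to brands that remain on the market: if a fraction $\theta_{t-1}$ of brands are cures and fakes are withdrawn with probability $\psi_t$, the share of cures among retained brands is $\theta_t$. The sketch below is illustrative only; the function names are hypothetical and the formula is the one reconstructed from (A1) in the Appendix.

```python
def withdrawal_probability(theta_prev: float, theta_next: float) -> float:
    """Probability of withdrawing a fake that moves beliefs from theta_prev to theta_next.

    Assumes the (A1) formula: psi = 1 - theta'(1 - theta')^{-1} (1 - theta) theta^{-1}
    when theta' < theta, and psi = 0 otherwise.
    """
    if not (0.0 < theta_prev < 1.0) or not (0.0 < theta_next <= 1.0):
        raise ValueError("beliefs must satisfy 0 < theta_prev < 1 and 0 < theta_next <= 1")
    if theta_prev >= theta_next:
        return 0.0
    return 1.0 - theta_prev * (1.0 - theta_next) / ((1.0 - theta_prev) * theta_next)


def posterior_if_retained(theta_prev: float, psi: float) -> float:
    """Bayes' rule: share of cures among brands that were not withdrawn."""
    retained_fakes = (1.0 - theta_prev) * (1.0 - psi)
    return theta_prev / (theta_prev + retained_fakes)


if __name__ == "__main__":
    psi = withdrawal_probability(0.2, 0.5)  # approximately 0.75
    print(psi, posterior_if_retained(0.2, psi))
```

Feeding the computed probability back through Bayes' rule recovers the target belief, which is the consistency requirement the lemma imposes.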

Theorem 2 and lemma 4 imply that the equilibrium outcomes are
fully characterized by the price path of a cure over the first two
periods $(p_1, p_2)$ and the beliefs of uninformed consumers $(\theta_1, \theta_2)$. If $\bar{p} <
w(\alpha, V)$, it takes one period to establish a cure; after marketing a drug
to a proportion of the population, an unsuccessful monopolist finds
that serving the remaining uninformed who fall sick next period is
unprofitable. This is the second scenario the Introduction mentioned.
There are two possibilities. Either $\omega > w(0, V)$ or vice versa. In both
cases fakes are withdrawn after being marketed only one period at
most because $w(\alpha, V) > \bar{p}$, and, as lemma 2 shows, no uninformed
person is willing to pay more than $\bar{p}$ for a drug. First, suppose $\omega >
w(0, V)$. Because there are no benefits from acquiring private
information, the reservation price of consumers for a new product that
cures with probability $\theta_0$ is $\omega$. As that exceeds the opportunity cost of
marketing fakes, every drug is introduced to the market. Second, let
$w(0, V) > \omega$. That is, given beliefs of $\theta_0$, the opportunity cost of
marketing fakes exceeds a sick person's reservation price.
Accordingly, fakes are not always introduced to the market; the introductory
price is $w(0, V)$, people believing that the drug works with probability
$\phi(0, V)$ at least.

Lemma 5. Suppose $\tau = 2$ and $\psi_1 < 1$. Then $\bar{p} \le w(\alpha, V)$. In this case
$(p_1, p_2) = (w(0, V) \vee \omega, \bar{p})$ and $\theta_2 = 1$. Moreover, if $w(0, V) > \omega$, then
$\theta_1 \ge \phi(0, V)$, but if $\omega > w(0, V)$, then $\theta_1 = \theta_0$.

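Some algebra shows that if $(p_1, p_2) = (\omega, \bar{p})$ and $(\theta_1, \theta_2) = (\theta_0, 1)$,
then the value of the game to the monopolist is $V_2$, defined as

$$V_2 = \alpha\beta\left\{(1 - \theta_0)(\omega - w) + \theta_0\left[(\omega - w) + \frac{\beta(\bar{p} - w)}{1 - \beta}\right]\right\} \times (1 - \beta^2 + \theta_0\beta^2)^{-1}. \tag{12}$$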


The expression on the right-hand side of (12) may be interpreted as
follows. Every even period the monopolist randomly draws a drug
from a laboratory that invents cures and fakes in the proportions $\theta_0$
and $(1 - \theta_0)$, respectively. If a fake is drawn, the monopolist receives
$\alpha\beta(\omega - w)$, but the payoff from a cure is $\alpha\beta\{\omega - w + [\beta(\bar{p} -
w)/(1 - \beta)]\}$. This game ends once a cure is drawn. The numerator in
(12) can be interpreted as the expected payment at the beginning
of every even period the game is played, while $(1 - \beta^2 + \theta_0\beta^2)^{-1}$
can be expressed as the infinite geometric sum $[1 + \beta^2(1 - \theta_0) +
\beta^4(1 - \theta_0)^2 + \cdots]$, which is the present value of receiving one unit of
account every even period until the game ends. Differentiating $V_2$
with respect to the structural parameters proves that the value of the
monopoly is positively related to how widespread the disease is (as this
affects $\alpha$), the degree of its intensity (which raises the reservation
prices $\omega$ and $\bar{p}$), and the probability of a discovery $\theta_0$ but is negatively
related to the production costs of drugs $w$ as well as the interest rate
$\beta^{-1}(1 - \beta)$.
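
As a numerical illustration of this interpretation, the sketch below evaluates $V_2$ as the expected payoff per draw, $(1 - \theta_0)\alpha\beta(\omega - w) + \theta_0\alpha\beta\{\omega - w + \beta(\bar{p} - w)/(1 - \beta)\}$, divided by $1 - \beta^2 + \theta_0\beta^2$, and checks the geometric-sum identity. This is one reading of (12) under the payoffs stated in the text; the function names and the parameter values in the example are hypothetical.

```python
def value_two_period_cycle(alpha, beta, theta0, omega, p_bar, w):
    """V_2 as read from (12): expected payoff per draw times the discount multiplier."""
    fake_payoff = alpha * beta * (omega - w)
    cure_payoff = alpha * beta * ((omega - w) + beta * (p_bar - w) / (1.0 - beta))
    expected_payoff = (1.0 - theta0) * fake_payoff + theta0 * cure_payoff
    return expected_payoff / (1.0 - beta**2 + theta0 * beta**2)


def truncated_geometric_sum(beta, theta0, cycle_length, n_terms=2000):
    """Partial sum of 1 + b^k (1 - theta0) + b^(2k) (1 - theta0)^2 + ... for k = cycle_length."""
    ratio = beta**cycle_length * (1.0 - theta0)
    return sum(ratio**j for j in range(n_terms))


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to satisfy w < omega < p_bar.
    params = dict(alpha=0.2, beta=0.9, omega=1.0, p_bar=2.0, w=0.5)
    for theta0 in (0.1, 0.3, 0.5):
        print(theta0, value_two_period_cycle(theta0=theta0, **params))
```

With these illustrative numbers the value rises with $\theta_0$ and falls with $w$, in line with the comparative statics described above.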

Now suppose $(p_1, p_2) = (w(0, V), \bar{p})$ and $(\theta_1, \theta_2) = (\phi(0, V), 1)$. In
this case an unsuccessful monopolist is indifferent between introducing
a fake and not doing so; consequently, the value of the game to
the monopolist is unaffected by never introducing fakes. Thus $V =
V_u$ as defined by (10), and $p_1 = w(0, V)$. When signaling does not
confer a reputation on a cure more quickly, pricing by a successful
monopolist that signals is identical to what would happen under the
second scenario (where no signaling occurs).

The first scenario mentioned in the Introduction occurs when
$w(\alpha, V) < \bar{p}$ and $\tau = 3$. Again, there are two cases to consider. First,
suppose $w(\alpha, V) < \omega$. In the second (and final) period in which a fake
is marketed, the uninformed sick are willing to pay up to $\omega$ for it even
if all fakes are marketed two periods. From the discussion following
definition 7, their reservation price in the first period, $v(\omega)$, exceeds $\omega$.
Hence, with $w(0, V) < w(\alpha, V) < \omega < v(\omega)$, it follows that a new drug is
always introduced irrespective of quality and marketed for two
periods, and at that time fakes are withdrawn. Given $(\theta_1, \theta_2) = (\theta_0, \theta_0)$
and $(p_1, p_2) = (v(\omega), \omega)$, the value of owning the monopoly is

$$V_3 = \alpha\beta\left\{(1 - \theta_0)[v(\omega) - w + \beta(1 - \alpha)(\omega - w)] + \theta_0\left[v(\omega) - w + \beta(\omega - w) + \frac{\beta^2(\bar{p} - w)}{1 - \beta}\right]\right\}(1 - \beta^3 + \theta_0\beta^3)^{-1}. \tag{13}$$

To calculate $V_3$, observe that $\alpha\beta(1 - \theta_0)[v(\omega) - w + \beta(1 - \alpha)(\omega - w)]$
is the probability of discovering a fake multiplied by its market
value, $\alpha\beta\theta_0\{v(\omega) - w + \beta(\omega - w) + [\beta^2(\bar{p} - w)/(1 - \beta)]\}$ is the
probability of discovering a cure multiplied by its market value, while
$(1 - \beta^3 + \theta_0\beta^3)^{-1}$ is the infinite geometric sum $[1 + (1 - \theta_0)\beta^3 +
(1 - \theta_0)^2\beta^6 + \cdots]$. The interpretation is similar to that for $V_2$, except
in this case the monopolist is sampling only every three periods.
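
The same decomposition can be checked numerically. The sketch below computes $V_3$ from the fake and cure payoffs described in this paragraph, with a three-period discount multiplier $(1 - \beta^3 + \theta_0\beta^3)^{-1}$. The first-period reservation price $v(\omega)$ is passed in as a number (`v_omega`) because its definition (definition 7) lies outside this section; the function name and all parameter values are hypothetical.

```python
def value_three_period_cycle(alpha, beta, theta0, omega, v_omega, p_bar, w):
    """V_3 as read from (13); v_omega stands in for the reservation price v(omega)."""
    # A fake sells at v(omega) in period 1 and, to the still-uninformed sick, at omega in period 2.
    fake_value = (v_omega - w) + beta * (1.0 - alpha) * (omega - w)
    # A cure sells at v(omega), then omega, then p_bar in every later period.
    cure_value = (v_omega - w) + beta * (omega - w) + beta**2 * (p_bar - w) / (1.0 - beta)
    expected = alpha * beta * ((1.0 - theta0) * fake_value + theta0 * cure_value)
    return expected / (1.0 - beta**3 + theta0 * beta**3)


if __name__ == "__main__":
    # Hypothetical values satisfying w < omega < v_omega < p_bar.
    print(value_three_period_cycle(0.2, 0.9, 0.3, omega=1.0, v_omega=1.2, p_bar=2.0, w=0.5))
```

Setting $\theta_0 = 1$ makes the multiplier equal to one, so the function returns the value of a cure alone, which is a quick sanity check on the reconstruction.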

If $\omega < w(\alpha, V) < \bar{p}$, drugs marketed two periods or more are priced
at $w(\alpha, V)$ in the second. This case differs from the one above because
$\omega < w(\alpha, V)$ rather than vice versa. Consequently, fakes are
withdrawn with strictly positive probability after one period. The chance
that a fake is withdrawn from the market after one period induces
uninformed consumers to revise upward their subjective beliefs about
the probability that a drug will work to at least $\phi(\alpha, V)$. Since signaling
is unprofitable for the successful monopolist, the results of every
experiment are introduced to the market regardless of the outcome, at
consumer reservation price $v[w(\alpha, V)]$. The direction of the price
change, from $v[w(\alpha, V)]$ to $w(\alpha, V)$, is ambiguous. On the one hand,
the opportunity cost of marketing a low-quality drug has risen; on the
other hand, the reservation price falls for any given subjective
probability because of the declining value of acquiring private information.
Since $v(p)$ is declining in $p$, there exists a unique number $\xi$ such that
$\xi = v(\xi)$; since $v(\bar{p}) = \omega < \bar{p}$, it immediately follows that $\omega < \xi < \bar{p}$.
One concludes that the price rises if and only if $w(\alpha, V) > \xi$.

The Introduction mentioned that people might acquire
information by purchasing lower-quality goods on average rather than
through paying higher prices. The scenario above shows how this can
happen even before the reputation of a brand has become
established. For example, although the price of a new brand rises when $\xi <
w(\alpha, V) < \bar{p}$ in the absence of signaling, so does the expected quality of
brands retained because some fakes are randomly withdrawn after
one period. (As $\psi_2 > 0$, it follows that $\theta_1 < \theta_2$.) Indeed, the low
introductory price offer does not compensate consumers for expected
lower quality in terms of its current benefits; that is, $u(p_1, \theta_1) < u(0, 0)$.
To see this, first observe that $\theta_1 = \theta_0$. Then, from definition 7, notice
that $u[v(p_2), \theta_0] - u(0, 0) = \alpha\beta\theta_0[u(0, 0) - u(p_2, 1)]$. Since $p_2 =
w(\alpha, V) < \bar{p}$, it follows from definition 2 that $u(0, 0) < u(p_2, 1)$. The
claim is now established because $v(p)$ is decreasing in $p$. Next period,
however, uninformed consumers need not take the future into
account to choose optimally: $u(p_2, \theta_2) = u(0, 0)$. (The equality follows
directly from definition 6.)

In this case V must solve 

[1 - p 2 - 0«0 3 - 0 2 (1 - 0 O )(1 - 0) - 0o0(l - a)-'(l - 0)]V 
= apjv[w + a -1 (l - a) -1 (l - 0 )V] - w + ^ ^ — ]• 

(14) 

The left-hand side of (14) is proportional to V, while the right-hand 
side is positive and declining in V, thus guaranteeing the existence of 
a unique solution. (It does not admit a closed form.) 

Lemma 6 summarizes the first scenario. 

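Lemma 6. Suppose $\tau = 3$. Then $w(\alpha, V) \le \bar{p}$. Also $(p_1, p_2) =
(v[w(\alpha, V) \vee \omega], w(\alpha, V) \vee \omega)$. If $\omega > w(\alpha, V)$, then $(\theta_1, \theta_2) = (\theta_0, \theta_0)$.
If $\omega \le w(\alpha, V)$, then $\theta_1 = \theta_0$ and $\theta_2 \ge \phi(\alpha, V)$.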

Two hallmarks of previous work on games with incomplete
information are evident from this analysis, namely, signaling (as analyzed by
Spence [1973] and others) and randomized revelation (see, e.g.,
Kreps and Wilson 1982a). Granted, neither phenomenon necessarily
occurs as an equilibrium outcome in this model, in particular, if
$(p_1, p_2) = (\omega, \bar{p})$ or $(p_1, p_2) = (v(\omega), \omega)$. Nevertheless, if neither occurs,
the resulting equilibrium outcome is indistinguishable from that
generated by a similar model in which the monopolist has no private
information.

The last assertion is established by briefly considering how the two
outcomes alluded to above arise in environments in which the
monopolist has no private information. First, suppose that nobody (including
the monopolist) knows the quality of a new brand, but everybody
retrospectively sees the effect of medication on others. Then, in
equilibrium, every brand is introduced at price $\omega$, but only a cure is
retained for more than one period; hence $(p_1, p_2) = (\omega, \bar{p})$. Second,
suppose that nobody knows the quality of a new brand and that the
only way the monopolist can determine its quality is via inference
from aggregate sales figures; nevertheless, as in the original model,
sick uninformed consumers taking medication simultaneously
become informed. Then in equilibrium every brand is marketed for two
periods, so $(p_1, p_2) = (v(\omega), \omega)$.

The notion that randomized revelation is important in games of
incomplete information is further bolstered by considering a third
alternative assumption about the structure of information. Let
everyone be initially uninformed and suppose that the monopolist (but not
healthy people) observes the effect of the drug on those who are
treated. One can show that in equilibrium all drugs are introduced,
but whether low-quality brands are withdrawn or not depends on the
value of $w(\alpha, V)$; thus a role for the opportunity cost of marketing
fakes reappears in the analysis and with it the possibility of
randomized withdrawals.

IV. Conclusion 

This paper does provide a new context in which signaling may
operate, namely, as a self-regulating device in a dynamic system. However,
its main contribution is to suggest what the alternatives to signaling
are and when they might arise. In particular, it explicitly models the
production technology, individuals' preferences, and their
information sets in order to investigate the intuitively appealing idea that
knowledge may diffuse throughout the population over time. When
signaling is not an important factor in marketing new products,
people try out a succession of them until a significant discovery is made, at
which time the pace of research slows down substantially. In such
cases the first group of buyers to try each new product pays more than
later groups (per unit adjusted for expected quality) because they
anticipate benefiting from private information acquired through
consumption. The cycles that are generated look like faddish behavior. It
would, however, be a mistake to conclude, after observing market
aggregates in this environment, that people behaved whimsically or
differed in their preferences over goods, their attitudes toward risk,
or their ability to process information.



Appendix 


Proof of Theorem 1 

Because lemmas 1—6 show that there are only a finite number of outcome 
paths for sequential equilibrium that meet condition 1 , it suffices to propose 
an assessment A as a candidate and then verify that A is a sequential equilib¬ 
rium satisfying condition 1. Following notation in the text, let t denote the 
time the most recent experiment was undertaken; that is, b u = l‘,Zbb, ~ 1. For
expository purposes, A is partitioned into four cases. When case a applies, 
quality is inferred from past prices and quantities traded. If b or c holds, the 
beliefs of the uninformed depend on the unsuccessful monopolist's
opportunity costs and the previous prices charged. The last case, d, deals with
behavior when everyone is uninformed. To economize on notation, let $\psi(\theta', \theta)$ be
the probability of withdrawing a low-quality brand, which induces
uninformed Bayesian agents to revise their beliefs upward from $\theta'$ to $\theta$:

$$\psi(\theta', \theta) = \begin{cases} 1 - \theta'(1 - \theta')^{-1}(1 - \theta)\theta^{-1} & \text{if } \theta' < \theta \\ 0 & \text{if } \theta' \ge \theta. \end{cases} \tag{A1}$$


In this assessment, $p(h^t, c_t)$ does not depend on $c_t$, so $A(h^t, c_t)$ is expressed as
$A(h^t)$ throughout, without creating ambiguities. Likewise, since periods are
dated by the age of the current brand, the $t$ subscript on $c_t$ is redundant and is
therefore dropped from now on. The four cases are now given as follows.

a) Suppose p, s p and A, > 0 for some s < t. With definition 3, it follows that 
q, = aA,c + a(l - A,)y,, and hence c = [ 9 , - (1 - a)A,-y,]/aAj. Suppose 
c = 1 ; then the game structure implies that the triplet ( 8 „ y t > pi) constitutes 
A(A') for any assessment A. Accordingly, set A (A') » (I, 5 (p), p). Suppose 
c = 0; then set A (A') = (0, 0, p, 1). 

b) Suppose that A< > 0 and assume, for all s < l, that if A, > 0 then p s > p. 
Also assume that if A, = 0 and A, +1 > 0 then p, > p 0 . There are three 
subcases. First, suppose that p < w(A„ V); then set A(A') = (1, 8(p,), p, 1). 
Second, suppose that w < w(A„ V) < p; then set (4>(A,, 1^), 1, w(A t , V), »|»[0 t - 1 , 
<b(A„ ?)]) if p, s w(A„ 1?) and set A (A') = (4>(A„ V). 0, u»(A„ 9), t|»[0,_i, 4>(A„ 
^)1) if pi > w(A„ tf). Third, suppose that u>(A„ V) < < 0 ; then set A (A') = (0o. 1, 
< 0 , 0) if p, -s w, and set A(A') = (0o, 0, w, 0) if p, > u>. 

c) Suppose that A, > 0 and assume, for all s < t, that if A, > 0 then p s > p.
Also assume that if A, = 0 and A 1+ i > 0 then p s as pa- Then set A{h‘) = 
(1, &(p,), p, 0) if w(At, V) ss p and set A{h‘) = (1, h(p t ), p, 1) if w{A h $) > p. 

d ) Suppose A t — 0. Set h, * (1, \,po A ui(0, V), 1) ifp t ^po Aui(0, handset 
A(h t ) * (0, 0, po A a>(0, V). 1) if p, > p Q A u/(0, V). 

The interested reader can verify that, under assumption 1, A is a sequential 
equilibrium satisfying condition 1. Q.E.D. 

Proof of Lemma 1 

Suppose 4>i = 1. Define dates r and s such that r < s, y T - y s - 1, and 
lYi = 2, By lemma 2, proved independently, p, as p. Therefore, since % - 
1 and consequently A, > 0, lemma 3, proved independently, implies tSj + 
1. Necessary and sufficient conditions for an unsuccessful monopolist not to 
defect by introducing a fake are 

a0 r (/>, - 3 r+ ’W (A2) 

a0 r (/>, -v>) + a(l - a)P(p, - w) s (0 r - 0 i+l )V. (A3) 

The first inequality, (A2), ensures that an unsuccessful monopolist does not 
withdraw his fake after one period of sales in r; the second, (A3), ensures that 
it is not profitable to market a fake in both periods. Subject to (A2) and (A3), a 
successful monopolist chooses r, s, p T . and p it where 0 < r < s and p,—p and 
p j p, to maximize 

ft ,+ i (p - w) 

ofiTp, - W ) + a&(p, -w)+ Z (A4) 

1 ~ P 

By inspection, the monopolist optimally sets r = 1 and s = 2. There are two 
cases to consider, depending on whether (A2) or (A3) is binding. When (A2)
is solved with equality, pi = w( 0, K) (see definition 4). Then if ui(a, V) 2 : p, it 
follows that ap[u>(0, V) — w] + a(I - a)0 2 (p — w) is less than (0 — 0 Z )K, 
which implies that (A3) is met with pz — p. Alternatively, assume that 
a>(a, V) s p; setting p t = p 0 and pz - p solves (A3) with equality while (A2) is 
automatically satisfied. Theorem 1 shows that this outcome can be supported 
as a sequential equilibrium satisfying condition 1; hence the first part of the 
lemma is proved. 

Now suppose that <jq > 0 and t > 2. Define the dates r and s as above. Then 
inequalities (A2) and (A3) must be satisfied; otherwise it would be suboptimal 
for an unsuccessful monopolist not to introduce a fake. From the previous 
paragraph, given (A2), (A3), and r < s, the value of the game is maximized by 
the successful monopolist if it chooses {pi, p%) - {po, />)• Hence f — 2. The 
second sentence in the lemma follows immediately. Q.E.D. 

Proof of Lemma 2 

Along the equilibrium path 8, is nondecreasing in t since only fakes are 
withdrawn and beliefs are structurally consistent. Therefore, from definition 
4, 8, = 1 for all (2f. Hence y{h‘, p,) = 0 if p, > p for Uf. 

Accordingly, suppose that h‘ E H‘ and p, > p for some l < f. Consider an 
uninformed sick person optimizing his discounted expected utility over the 
duration of the current component game. Assume that his equilibrium action
is to buy the product; consequently he becomes informed and hence A, > 0 
for all s > t. if p s > p, then i(p s ) - 0. But if p, s p for some s > t, then by 
lemma 3 (which is proved independently), f ;s s + 1. Thus a sick uninformed 
person, whose equilibrium action is to take medication in period t, repeats his 
purchase at most once before a cure's reputation is established and then only 
if he falls ill on date f - 1 and p 7 -\ s p. Therefore, since «(0, 1) = 0, his 
expected utility at time / until the end of the current component game is 

♦ - 1 s 

pu(p» 0.) + au(0,0) X + (1 - n* 0 - 

l-«+l L r-»+l 

+ i) - u(o, o)]. 

Now consider the following defection. The consumer does not buy the prod¬ 
uct until t — 1 (provided it lasts that long), and then only if he falls ill. The 
expected utility, calculated at t, is 

f- 1 s 

0‘«(0,0) + «u(0,0) £ + (1 - 0,) n 

J -/+1 L r -/+1 

t - 1 

+ a(l - 0 ,) n a- 0 ) - 

r = / +• 1 

+ D- «(0,0)]. 

Subtracting (A6) from (A5), one obtains 

1 

P'NA.»r) - u( 0 , 0 )] + «{1 - 0 ,) n 0- i)P , ‘ 1 [«(0,0) - u(p r . lt 0 )] 

r*l+ 1 

- i) + (i - e,)p'«(p„o) - p'u(o, 0) 

+-1 

+«d - o,) n 0 - ^)p ,_, wo.o) - u(^-.,o)] 

r»f + 1 

<<1 - e*)PW<.0) - u(0,0)] (A7) 

t -1 

+ «(1 - «,) n U- ^)P ,_, [u(0,0) - u(p T _), 0)] 
r~t +1 

=s(i - •i>["(*-i.0) - u( 0,0)] 

t -1 

+ a(l - 0 ,) f] 0 - 4 »r)P + "‘[a( 0 , 0 ) - u(p 7 - 1 , 0 )] 

1 

[ t- 1 

1 - a f] (1 - fc)P , ~ l |M0.0) - u(p»-i, 0)J. 


(I - *r)l 

(A6) 

u(0, 0)] 


(A5) 

The first equality in (A7) expands $u(p_t, \theta_t)$ using definition 1; the next line
follows because $u(p_t, 1) < u(0, 0)$ (by hypothesis); the third is a consequence of
the hypothesis that $p_{\tau-1} \le \bar{p} < p_t$. Collecting terms establishes the bottom
equality, which, by inspection, is negative. So from (A7) one deduces that the
hypothesized equilibrium path requires an uninformed consumer to move
nonoptimally, implying that the contrary hypothesis is false. Q.E.D.

Proof of Lemma 3 

Suppose $\lambda_t > 0$ and $p_t \le \bar{p}$. Then $q_t = \alpha[\gamma_t(1 - \lambda_t) + c_t\lambda_t]$ or $c_t = \lambda_t^{-1}[\alpha^{-1}q_t -
\gamma_t(1 - \lambda_t)]$. If $c_t = 0$, then $\gamma(h^s, p_s) = 0$ for all $s > t$. Therefore, the present
value of the firm at date $t + 1$ is

* i — 1 

v x I n ^ - *4 

,«/+i L*-/+i J 

This is maximized by setting $\psi_{t+1} = 1$. If $c_t = 1$, then $\theta_{t+1} = 1$. Hence $p_{t+1} = \bar{p}$
and $\gamma(h^{t+1}, \bar{p}) = 1$, which, from definition 3, yields the result. Q.E.D.

Proof of Theorem 2 

This proceeds in four steps. The first step shows $\sum_{t=1}^{\tau-1} \gamma_t \le 2$. The second
step shows $\sum_{t=1}^{\tau-1} \gamma_t \ge 1$. Putting the two statements together, one obtains
$\sum_{t=1}^{\tau-1} \gamma_t \in \{1, 2\}$. For notational convenience define $\gamma^\tau = (\gamma_1, \ldots, \gamma_{\tau-1})$. The
third step shows that if $\sum_{t=1}^{\tau-1} \gamma_t = 1$ then $\gamma^\tau = (1)$; similarly the final step shows
that if $\sum_{t=1}^{\tau-1} \gamma_t = 2$ then $\gamma^\tau = (1, 1)$.

First, the theorem implies $\sum_{t=1}^{\tau-1} \gamma_t \le 2$. Consider the alternative hypothesis
that there exist dates $r < s < t \le \tau - 1$ such that $\gamma_r = \gamma_s = \gamma_t = 1$. Then $\lambda_s > 0$
since $\gamma_r = 1$. Furthermore, lemma 2 implies $p_s \le \bar{p}$ because $\gamma_s = 1$. Hence $s =
\tau - 1$ by lemma 3. Consequently, $\tau - 1 < t$. This inequality contradicts the
alternative hypothesis. Therefore, $\sum_{t=1}^{\tau-1} \gamma_t \le 2$.

Second, the theorem also implies X^y, & I. Clearly, f > 1, otherwise ar 
unsuccessful monopolist would invariably introduce its fakes at price p and ii 
would not be optimal for sick people to take treatment. Accordingly, considei 
the alternative hypothesis that X^i 1 y, — 0 for some f > 1. Then the currem 
value of an unsuccessful monopolist at date 1 is 

* r i - 1 , 

- **> ■ 

3 “o L*»i J 

To optimize, 4*i — 1 - But lemma 1 shows that if «|»i = 1 then y\ — 1-There¬ 
fore, y, > 0, contradicting the hypothesis. 

The two paragraphs above imply $\sum_{t=1}^{\tau-1} \gamma_t \in \{1, 2\}$, and these two possibilities
are now considered in turn. Third, suppose $\sum_{t=1}^{\tau-1} \gamma_t = 1$. Then $\gamma^\tau = (1)$ or
$\gamma^\tau = (0, \ldots, 0, 1)$ or $\gamma^\tau = (0, \ldots, 0, 1, 0, \ldots, 0)$. Suppose $\gamma^\tau = (1, 0, \ldots, 0)$ or
$\gamma^\tau = (0, \ldots, 0, 1, 0, \ldots, 0)$, where $\gamma_t = 1$ (i.e., the nonzero element occurs in
the $t$th place in $\gamma^\tau$). Again the value of the game to the unsuccessful
monopolist at the beginning of the next period is maximized by setting $\psi_{t+1} = 1$;
hence $\theta_{t+1} = 1$ and $\gamma(h^{t+1}, \bar{p}) = 1$. Therefore, $t = \tau - 1$, contradicting the
hypotheses that $\gamma^\tau = (0, \ldots, 0, 1, 0, \ldots, 0)$ and that $\gamma^\tau = (1, 0, \ldots, 0)$.

Now consider the hypothesis that y' = (0,..., 0, 1). By lemma 1, = 0. 

This implies p* ~ ‘[a(p T _, - w) + fiV(A)] > V(A); therefore, [a(p T _ 1 - ui) + 
pVo(^)] > V(A) for all s < f — 1 (since P < 1); consequently, = 0 for all 1 s 
iSf - 1. Therefore, (0 T _ 1, p T -i) = (0o. *»)• Also, ui(a, V) > p (otherwise it 
would not be optimal for an unsuccessful monopolist to withdraw the fake in 
period t). The third step is completed by showing that there exists another 
sequential equilibrium assessment, denoted by A, that satisfies condition 1 
and is more profitable to a successful monopolist than A. This proves that A 
does not satisfy condition 2 and hence, by a contradiction argument, y T ¥= 

(0.0,1). For convenience, let A (A 5 , A') denote A (A 1+,_1 ) if A 1 comprises the 

first (s - 1) elements of hf +, ~ 1 and hf the final (t - 1) elements. Given A T ~ 1 €E 
H r ~\ define the assessment A by setting A (A') = A(A T_ \ A') for all A' 6 H‘ and 
all t. The key difference between the two assessments is that in A, but not A, 
the drug is marketed as soon as it is developed. It is straightforward to check 
that since A is a sequential equilibrium satisfying condition 1, so is A. But 

V' = afi[(td - w) + p(p - w)(l - P)-’] 

> -■‘[(a. - w) + B(p - w)(l - P)''] (A8) 

= V\ 

Thus a successful monopolist prefers A to A. Hence A does not satisfy condi¬ 
tion 2. Therefore, = 1 implies y T = (1). 

Fourth, suppose X/Ji 1 y, = 2. By the same argument that was used in the 
third step, 71 = 1. Therefore, y T = (1, 1) or y T = (1,0,..., 1). Suppose that, 
contrary to the theorem, y T = (1,0,..., 1). Now if a successful monopolist 
charged p s p in period 2, the a 2 informed sick would purchase the cure, and 
by lemma 3, its reputation is established in period 3. So to prevent the monop¬ 
olist from defecting from A after discovering the cure, its value at the begin¬ 
ning of period 2 under A must be at least a {a (p — w) +■ [p (p — u>)/(l - P)]}. 
Moreover, since f > 3 by hypothesis, the argument above shows pz> p and 
hence fo = 0. Therefore, an upper bound on the value of owning a successful 
monopoly under A is ap(/i - w)/(l — P), contradicting the lower bound 
derived above. Hence = 2 implies y T = (1, 1), as claimed. Q.E.D. 


Proof of Lemma 4 

The formula for <j» ( follows directly from the definition of structural consis¬ 
tency. Consider any sequential equilibrium A satisfying condition 1. Suppose 
that there exists some h‘ E H‘ such that if c, - 0 then p, = p', but if c, = 1 then 
Pt = p", where/*' v 4 p". A necessary condition for beliefs to be consistent is that 
0» = 0 if p t = p'. In this event s f, = 0 for all j s / satisfying £*_o A, = 0. The 
value of an unsuccessful firm is thus maximized by setting = 1- Hence p" = 
p and p) = 1. But this implies that it is profitable for the monopolist, if 
unsuccessful, to defect from A by charging p in period /, thus upsetting the 
equilibrium and hence contradicting the conjecture. Q.E.D. 

Proof of Lemma 5 

Suppose, to the contrary, that $w(\alpha, V) < \bar{p}$. The value to an unsuccessful
monopolist from defecting by not withdrawing the fake after one period is
$\alpha(1 - \alpha)(\bar{p} - w) + \beta V$, which is greater than $\alpha(1 - \alpha)[w(\alpha, V) - w] + \beta V$.
But definition 4 implies $V = \alpha(1 - \alpha)[w(\alpha, V) - w] + \beta V$. Hence it is not
optimal to withdraw a fake after one period, contradicting the premise that
$\tau = 2$. Therefore, $w(\alpha, V) \ge \bar{p}$ as claimed. Since $\theta_1 \ge \theta_0$, the firm can sell its
drug to the sick for at least $\omega$. First, if $\omega > w(0, V)$, it follows that $\psi_1 = 0$.
Bayesian consistency requires $\theta_1 = \theta_0$, and hence the reservation price of the
uninformed $\omega$ is charged, as claimed. Second, suppose $w(0, V) > \omega$. Since
$\gamma_1 = 1$, it follows that $u(p_1, \theta_1) \ge u(0, 0)$. Also $w(\alpha, V) \ge \bar{p}$ implies $p_1 \ge w(0, V)$;
since $w(0, V) > \omega$, it follows that $\theta_1 > \theta_0$. Consequently, $1 > \psi_1 > 0$, which
implies $p_1 = w(0, V)$. Q.E.D.
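
The indifference condition used in this proof, $V = \alpha(1 - \alpha)[w(\alpha, V) - w] + \beta V$, can be solved for the opportunity-cost price: $w(\alpha, V) = w + (1 - \beta)V / [\alpha(1 - \alpha)]$. The sketch below computes that price and verifies the condition numerically. Generalizing the first argument to an arbitrary informed share `lam` (with `lam = 0` giving $w(0, V)$) is an assumption made here for illustration, since definition 4 itself lies outside this section; the function names and numbers are hypothetical.

```python
def opportunity_cost_price(lam, value, alpha, beta, w):
    """Price at which an unsuccessful monopolist is indifferent to selling a fake one more period.

    Assumes the condition V = alpha * (1 - lam) * (price - w) + beta * V, so that
    price = w + (1 - beta) * V / (alpha * (1 - lam)). With lam = alpha this matches
    the expression quoted from definition 4 in the proof of lemma 5.
    """
    if not (0.0 <= lam < 1.0 and 0.0 < alpha <= 1.0 and 0.0 < beta < 1.0):
        raise ValueError("require 0 <= lam < 1, 0 < alpha <= 1, 0 < beta < 1")
    return w + (1.0 - beta) * value / (alpha * (1.0 - lam))


def payoff_from_one_more_period(lam, price, value, alpha, beta, w):
    """Current sales to the uninformed sick plus the discounted continuation value."""
    return alpha * (1.0 - lam) * (price - w) + beta * value


if __name__ == "__main__":
    alpha, beta, w, value = 0.2, 0.9, 0.5, 2.0  # hypothetical numbers
    price = opportunity_cost_price(alpha, value, alpha, beta, w)
    print(price, payoff_from_one_more_period(alpha, price, value, alpha, beta, w))
```

At any price above this threshold, keeping the fake on the market for one more period yields more than $V$, which is the defection the proof relies on.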


Proof of Lemma 6 

If $w(\alpha, V) > \bar{p}$, then an unsuccessful monopolist would optimally withdraw its
brand after one period, contradicting the premise that $\tau = 3$. Therefore,
$w(\alpha, V) \le \bar{p}$. Also lemma 1 implies $\psi_1 = 0$; hence consistency of beliefs
requires $\theta_1 = \theta_0$. Then $p_1 = v(p_2)$ and, by arguments similar to those given in
the proof to lemma 5, $p_2 = w(\alpha, V) \vee \omega$, and $\omega > w(\alpha, V)$ implies $\theta_2 = \theta_0$, while
$\omega < w(\alpha, V)$ implies $\theta_2 \ge \phi(\alpha, V)$. Q.E.D.

Unlimited Liability as a Barrier to Entry


Jack L. Carr and G. Frank Mathewson 

University of Toronto 


Many but not all firms have the freedom to choose liability rules. In
some countries, service professions have unlimited liability rules
imposed by government; historically, banks in some countries faced
unlimited liability. Why do governments impose unlimited liability?
This is the question we address. With a simple model, we illustrate
the agency conflicts in firms. Limited liability solves these conflicts
efficiently. Unlimited liability raises the cost of capital; inefficiently
small firms result. But under some conditions, selectively applied
unlimited liability rules protect rents. We test several propositions
with data on Scottish banking and U.S. law firms.


I. Introduction 

Why are publicly mandated rules leaving owners’ liability unlimited 
imposed on selected industries? Although the literature contains 
many recent examinations of the nature of the firm under different 
liability regimes, little of policy relevance emerges from the
discussion. 1 The most efficient policy would be to allow freedom of choice
on liability—to permit the market to choose the most efficient rule. 


We wish to thank Tom Borcherding, Henry Butler. Sherry Glied, Arthur Hosios, 
Yehuda Kotowitz, Henry Manne, Peter Pashigian, George Stigler, Michael Trebilcock, 
Chicago, the Connaught Fund for Legal Theory at the University of Toronto, and the
Social Science and Humanities Research Council of Canada. 

1 This extended literature includes Manne (1967), Posner (1976), Meiners, Mofsky, 
and Tollison (1979), Halpern, Trebilcock, and Turnbull (1980), and Easterbrook and
Fischel (1985). 

Currently, in many jurisdictions, most firms have the freedom to
choose liability rules. This freedom, however, is not universal; some
firms have no choice. For example, in service professions such as
accounting, law, and medicine, owners of firms typically are forced to
accept unlimited liability. There is another example. In Scotland until
the middle of the nineteenth century, any bank could issue notes.
Only three banks (located in Edinburgh) had limited liability in their
charters; all others until 1879 faced de facto unlimited liability for
note issue imposed by Parliament. Why should any jurisdiction
mandate a liability rule? What are the effects of these rules? These are the
unanswered questions that we address.

An understanding of the relative efficiency of alternative liability
rules sheds no light on these questions. One potential explanation lies
in an externalities argument: a public-interest approach. Suppose
that consumers of a product are inadequately informed about its
potential risks. The market failure argument maintains that
unlimited liability provides protection to the uninformed consumer: An
increase in potential liability to the supplier reduces the expected
return from fraud by increasing the penalty if fraud is detected and
therefore deters chiseling on product quality. The net result,
according to this argument, is a reduction in the risk to the consumer from
misleadingly low quality products or services. Similar arguments
maintain that unlimited liability provides needed protection to the
uninformed creditors of firms: Knowledgeable managers or owners
cannot exploit for their personal advantage uninformed suppliers of
financial capital (debt) or other inputs.

Externalities that yield risks to unwilling third parties motivate
another public-interest defense of unlimited liability. Consider banks
with limited liability to their depositors, and suppose that a single
bank is mismanaged. Eventually depositors discover this
mismanagement and the bank incurs a run on its deposits. If this event yields a
reduction in the depositors' confidence in the entire banking system,
bank runs will be widespread, with potentially disastrous
consequences for the banking system. Some argue that unlimited liability
maintains depositors' confidence in banks, eliminates systemwide
runs, and preserves the stability of the banking system. 2 Critical to this

2 The externality argument for runs on banks has been widely accepted. Friedman 
and Schwartz (1963) use this argument to justify deposit insurance. Adam Smith 
([1776] 1937, p. 313), on the other hand, disagreed with the externality argument and 
urged free banking as the best policy of protection against runs. 

The late multiplication of banking companies in both parts of the united
kingdom . . . instead of diminishing, increases the security of the public. It
obliges all of them to be more circumspect in their conduct . . . by not
extending their currency beyond its due proportion to their cash, to guard
themselves against those malicious runs. . . . By dividing the whole circulation into a

argument is a belief that bank creditors face sufficiently high
marginal costs of information about specific banks that they are unable to
distinguish firm-specific and industrywide shocks.

In our view, these public-interest arguments are incorrect and are
refuted by the available data. In particular, we advance a
private-interest explanation for mandated unlimited liability rules: Such rules
reduce the ability of firms to enter the capital market, increase costs,
and reduce competition; as such, unlimited liability facilitates local
monopolies, protecting the rents of these firms. At one level of
generality these results are not surprising; selectively applied liability rules
invariably constitute barriers to entry.

In Section II, we present a simple model of a bank that permits an 
examination of limited liability in financial institutions faced with a 
standard agency problem: managers in financial institutions who pos¬ 
sess superior information on investment alternatives may use that 
information to enhance their wealth at the expense of equity holders 
and depositors (creditors); individual shareholders in unlimited liability
firms are exposed to the risk that in bankruptcy they must compensate
for other shareholders who cannot meet their additional obligation.
Does the shifting of bankruptcy risk to depositors through
limited liability yield inefficiencies in such institutions? The answer is 
no; the results are quite the opposite. Why then do such rules exist? 
Our answer is rent protection. 

Two sets of data test our hypothesis. Section III tests our predic¬ 
tions on firm size using eighteenth-century Scottish banking data. 
Section IV extends our model to a service industry such as account¬ 
ing, medicine, or, in particular, law. Lawyers in partnerships have 
ownership rights on the residual flow of revenues from other lawyers 
in the firm. Under unlimited liability, these ownership rights are tied 
with additional obligations for payouts in malpractice suits. In gen¬ 
eral, this reduces the relative attractiveness of these rights, constrains 
firm size, and increases prices for legal services. We test the effect of 
limited liability on the size of law firms. 

II. Model 

Before we develop a model of a simplified depository institution, we 
set forth our general results and their economic intuition. Why 


greater number of parts, the failure of any one company . . . becomes of less
consequence to the public. ... In general, if any branch of trade, or any 
division of labour, be advantageous to the public, the freer and more general 
the competition, it will always be the more so. 

The same argument was used in England during debates on the Limited Liability Act 
of 1855, when opponents argued that limited liability, by reducing security, would 
endanger the reputation of British merchants (see Shannon 1931, p. 285). 
should liability rules affect resource allocation? The Coase theorem
informs us that in the absence of transactions costs, liability rules are 
irrelevant. 

Our model is driven by the effect of unlimited liability on the cost of 
capital, sometimes labeled the secondary markets effect. This effect 
refers to the increased cost of ownership rights in unlimited liability 
institutions: With unlimited liability, either ownership rights contain 
costly covenants that restrict the transferability of these rights to indi¬ 
viduals with wealth sufficient to meet their share of any liability accru¬ 
ing from insolvency or shareholders must engage in costly monitoring 
to verify the liquidity of their ownership partners. In either case, 
these ownership rights are less attractive than their limited liability 
counterparts. 3

The economics is most precisely illustrated in a simple model. The 
sequencing of the contracts is important. Equity holders in depository 
institutions have sufficient wealth to permit them to hold ownership 
rights in firms and a comparative advantage at writing labor and 
deposit contracts. Equity holders write contracts with managers and 
with depositors (creditors). Because direct monitoring by depositors is 
prohibitively costly, they rely on the contracts written by owners plus 
the magnitude of the equity interest in the firm to induce honest 
managerial behavior. Equity capital is raised at the outset. Next, man¬ 
agers are recruited and sign employment contracts with the owners of 
the firm. At this point, the financial institution opens its doors to 
accept deposits. In the usual fashion, temporal consistency means that 
contracts can be viewed recursively. 

One moral hazard problem flows from the informational advantage 
possessed by the managers in the firm. Managers know their own 
input levels. Others are incapable of easily sorting out from returns 
the state of the investment market and the effort of the managers. A 
second moral hazard problem surrounds the incentives for owners to 
enforce managerial contracts that protect the interests of depositors. 
A third moral hazard problem present under unlimited liability flows 
from the ability of some shareholders to free-ride on the wealth of 
others who provide the guarantee of additional payments to cover 
insolvencies. 

Investigation of these effects requires a simple illustrative model. 
We consider, in turn, limited and unlimited liability and ask how the 
equilibrium changes between these two regimes. Under limited liabil¬ 
ity, by definition, equity holders know that the limit of their wealth 

3 A similar observation was made by John Stuart Mill (1850, p. 169): “I think that the
great value of a limitation of responsibility, as relates to the working classes, would be
not so much to facilitate the investment of their savings, not so much to enable the poor
to lend to those who are rich, as to enable the rich to lend to those who are poor.” More
recently, Woodward (1985) has examined the transactions motive for limited liability. 
commitment to the firm is their ex ante investment in the firm’s equity
capital. Under unlimited liability, equity holders may be liable ex post 
for the firm’s debt. 

Due care by managers is fixed and normalized at one with an opportunity
cost of w. Equity holders collectively invest e in each firm
and write the managerial employment contract. For simplification, we
assume that depositors deposit $1.00 in each financial institution; 4 in
equilibrium, all institutions offer the same risk-corrected rate of return
on each deposit. The total pool of resources available for investment
by each firm is then $(1 + e). These resources have an opportunity
cost of investment in physical capital that has a marginal return of
q. There are two states reflecting the returns on investment. State 1
denotes “bad times” or a low return; state 2 denotes “good times” or a
high return. The return in each state is defined as \(r_1 < r_2\), and the
corresponding probability of states is defined by \(\theta\) and \(1 - \theta\).

Consider first the conventional moral hazard problem between 
managers and owners. Managers knowing their own input can cost¬ 
lessly infer the investment state ex post. In the absence of monitoring, 
equity holders and depositors cannot separate managerial inputs 
from financial returns, a standard moral hazard problem. These re¬ 
turns depend jointly on two unobservables, the investment state and 
the manager’s actions. This informational advantage affords man¬ 
agers an opportunity to chisel equity holders and possibly depositors. 
Specifically, managers have the option of reducing their input below
that anticipated by the equity holders. If state 2 is still drawn (for
simplicity we ignore the impact of reduced managerial efforts on the
likelihood of state 2), the managers could announce state 1, pay a
return to depositors consistent with this state (\(r_1\)), collect a salary equal
to w, and supplement this by reselling on an external labor market the
reduced input \(\gamma\) (\(1 > \gamma > 0\)), where \(\gamma\) is defined by \(r_2(1 - \gamma) \ge r_1\).
Doing so even without monitoring has one risk for managers: should
state 1 occur and the realized investment return be \(r_1(1 - \gamma)\), equity
holders would immediately detect chiseling and fire the managers.

We assume that managers can neither post performance bonds nor
become equity holders to internalize this agency problem: either managers
have limited wealth or these financial institutions themselves
cannot write loan contracts that dominate their employment contracts.
This rules out owner-managed firms as the efficient solution to
the managerial moral hazard problem. Furthermore, we assume that
the comparative advantage for writing employment contracts (because
of transferable rights) rests with equity holders and not depositors.
Equity holders have at their disposal two instruments to control
potential managerial chiseling. First, they can monitor managers
through audits; monitoring measures and verifies managerial inputs.
Second, equity holders can offer managers salary premiums. (This is
similar to the contracts in Becker and Stigler [1974] and Shapiro and
Stiglitz [1984].) With strictly positive marginal costs of monitoring,
wage premia are always strictly positive. We define \(p_m\) to be the
frequency of managerial monitoring.

4 This means that deposits are inelastic with respect to the relative efficiency of
organizational arrangements. In particular, variations in equity in the firm under this
condition are equivalent to variations in the debt-equity ratio.

Managers caught chiseling, whether through monitoring or low 
financial returns, are fired and forfeit all wages. (Punitive damages 
levied on managers that chisel alter slightly the algebra but not the 
economics of our results.) To promote honesty, in the absence of any 
signs of chiseling, managers receive a simple salary bonus, denoted by 
P, to be not less than the actuarially corrected return from chiseling:
\(P \ge \Theta(p_m)\gamma w\), where \(\Theta(p_m) = 1/[1 - \theta(1 - p_m)]\)
with \(\Theta' < 0\) and \(\Theta'' > 0\). The firm has a managerial
requirement relation given by \(m(1 + e)\) with \(m', m'' > 0\). Increases
in equity require an increasing managerial staff to administer the
corresponding investment portfolio. This relation serves to limit the
size of the firm.
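
The bonus condition can be illustrated numerically. The sketch below is not part of the formal model: it assumes that the parameter inside the correction factor is the state probability θ, read here as the probability of the high-return state, and the function names and parameter values are hypothetical choices made only for illustration. It shows that more frequent monitoring lowers the smallest bonus that deters chiseling.

```python
def correction_factor(p_m: float, theta: float) -> float:
    """Actuarial correction 1 / [1 - theta * (1 - p_m)].

    p_m   : monitoring frequency, between 0 and 1
    theta : state probability (assumed here to be that of the good state)
    """
    if not (0.0 <= p_m <= 1.0):
        raise ValueError("p_m must lie in [0, 1]")
    if not (0.0 <= theta < 1.0):
        raise ValueError("theta must lie in [0, 1)")
    return 1.0 / (1.0 - theta * (1.0 - p_m))


def min_bonus(p_m: float, theta: float, gamma: float, w: float) -> float:
    """Smallest bonus satisfying P >= Theta(p_m) * gamma * w."""
    return correction_factor(p_m, theta) * gamma * w


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the direction of the effect.
    for p_m in (0.0, 0.25, 0.5, 1.0):
        print(p_m, round(min_bonus(p_m, theta=0.5, gamma=0.2, w=10.0), 4))
```

With these illustrative values the required bonus falls monotonically as monitoring becomes more frequent, which is the sense in which monitoring and wage premia are substitute instruments for the equity holders.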

Next, there is the possibility that owners and managers may con¬ 
spire jointly to chisel depositors. Under competition and with per¬ 
formance, equity holders make a competitive return. Then, any joint 
conspiracy is successful if the equity is large enough that its opportu¬ 
nity cost plus that of the managers exceed the expected premium 
collected by managers if they chisel together with the return from the 
sale of their chiseled output. We assume that this condition holds. 

Equity can be raised, managerial contracts can be written, and de¬ 
posits can be accepted under two alternative liability regimes. Our 
central hypotheses flow from our contention that the nature of these 
liability regimes significantly alters the nature of the firm. 

A. Liability Rules 

We assume that bankruptcy invariably arises in the low-investment 
state. We also assume that both the limited and unlimited liability 
contracts generate sufficient revenues under (possibly constrained) 
profit maximization that managers’ salaries and monitoring expenses 
are covered. The liability rules of interest to us are in place to protect 
the payments to general creditors (depositors) of the firm, not these 
other factors of production. In our subsequent banking application, 
both unlimited and limited liability firms coexist. If our contention 
holds that limited is more efficient than unlimited liability in banking, 
only the constrained supply of limited liability charters prevents lim¬ 
ited liability firms from dominating the market. Furthermore, under 
competition, expected returns to depositors must be identical between
the two types of organizations: any rents from limited as opposed
to unlimited liability must accrue to the shareholders of limited
liability firms, the owners of the scarce charters of incorporation.

We describe recursively the contracts written under each of the 
liability regimes and synthesize them in one model. Under limited 
liability and our assumptions, depositors receive a return defined by r
should state 2 occur, and they become residual income recipients 
under bankruptcy when state 1 is drawn. 

With mandated unlimited liability, equity holders’ personal wealth
guarantees a return to depositors. Under the assumptions of our
model, depositors call on this insurance option under the most efficient
(constrained) contract in state 1, which directly increases the
cost of equity capital and reduces its liquidity. How is this solvency
constraint formulated? For purposes of illustration, imagine that
ownership rights in the firm are held equally by \(J\) shareholders and
focus on the additional insurance exposure for each shareholder because
of the inability to write complete ownership contracts with other
owners (\(J\) is assumed to be larger than one but exogenously given).
Denote each shareholder’s wealth as \(W_j\) and the additional total
wealth needed to guarantee the firm’s solvency (including the opportunity
cost of capital) in state 1 as \(W^0\). Consider the \(J\)th shareholder
and define \(W_J \equiv \sum_{j \ne J} W_j\). Under joint and several liability,
the \(J\)th shareholder’s liability is either \(W^0/J\) if \((J - 1)W^0/J < W_J\)
(which we assume occurs with probability \(\alpha\), \(1 > \alpha > 0\)) or
\((W^0/J) + [(J - 1)W^0/J] - W_J\) if \((J - 1)W^0/J > W_J\) (which occurs
with probability \(1 - \alpha\)). The perceived liability at the time of
bankruptcy for the \(J\)th shareholder is \(B^0/J\), where
\(B^0 \equiv \alpha W^0 + J(1 - \alpha)\{[(J - 1)W^0/J] - W_J\}\).
Symmetrical expected liabilities face the other shareholders.
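
The two liability cases can be made concrete with a small numerical sketch. It is an illustration only: the function names and figures are hypothetical, the symbol for the combined wealth of the other shareholders is read as W_J, and the knife-edge case in which the others' wealth exactly equals their share (which the text leaves open) is assigned to the first branch.

```python
def shareholder_liability(w0: float, j: int, others_wealth: float) -> float:
    """Liability of one shareholder under joint and several liability.

    w0            : total additional wealth needed for solvency in state 1
    j             : number of equal shareholders (j > 1)
    others_wealth : combined wealth of the other j - 1 shareholders
    """
    if j <= 1:
        raise ValueError("j must exceed one")
    own_share = w0 / j
    others_share = (j - 1) * w0 / j
    if others_wealth >= others_share:
        # The others can cover their part: only the pro rata share is owed.
        return own_share
    # Otherwise this shareholder also covers the others' shortfall.
    return own_share + others_share - others_wealth


def perceived_total_liability(w0: float, j: int, others_wealth: float, alpha: float) -> float:
    """B0 = alpha*W0 + J*(1 - alpha)*([(J - 1)*W0/J] - W_J), as stated in the text."""
    return alpha * w0 + j * (1.0 - alpha) * ((j - 1) * w0 / j - others_wealth)


if __name__ == "__main__":
    # Hypothetical figures: 4 shareholders, 100 needed for solvency.
    print(shareholder_liability(100.0, 4, others_wealth=90.0))   # others solvent
    print(shareholder_liability(100.0, 4, others_wealth=60.0))   # others fall short
    print(perceived_total_liability(100.0, 4, others_wealth=60.0, alpha=0.8))
```

In the second call the shareholder owes more than the pro rata share because the partners cannot meet theirs; this extra exposure is what makes ownership rights under unlimited liability less attractive.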

Under unlimited liability, shareholders, however, have every incentive
to engage in costly search to verify the wealth of other actual and
potential shareholders. We define \(p_s\) to be the frequency of this
search; search increases the probability that each shareholder has
sufficient wealth to meet any subsequent call for cash subsidies in the
case of bankruptcy. (Explicit bond posting by shareholders may reduce
\(B^0 - W^0\). Even this, however, requires costly monitoring.) In
particular, \(\alpha = \alpha(p_s)\) with \(\alpha' > 0\) and \(\alpha(1) = 1\). Thus
\(B^0(p_s) \equiv \alpha(p_s)W^0 + J[1 - \alpha(p_s)]\{[(J - 1)W^0/J] - W_J\}\),
where \(B^{0\prime} < 0\) and \(B^{0\prime\prime} > 0\).
Under limited liability, such shareholder monitoring is obviously
zero. Managerial monitoring costs are incurred per unit of managerial
input; shareholder search costs are incurred by each of the \(J\)
shareholders to verify the wealth of the other \(J - 1\) shareholders.
The total cost of these monitoring and search activities is
\([p_m m(1 + e) + J(J - 1)p_s]\).
We may now assemble these components to define monitoring and
firm size. Under limited liability, the efficient deposit contract maximizes
expected profits for the owners of firms (denoted by \(E\Pi\)) subject
to the expected return for depositors (denoted by \(ED\)) being
competitive. This deposit contract is defined by

\[
\max_{r}\; E\Pi \equiv \theta\bigl(r_2(1 + e) - \{w[1 + \gamma\Theta(p_m)] + p_m\}m(1 + e) - r\bigr) - J(J - 1)p_s - qe
\]

subject to

\[
ED \equiv \theta r + (1 - \theta)\bigl(r_1(1 + e) - \{w[1 + \gamma\Theta(p_m)] + p_m\}m(1 + e)\bigr) - q \ge 0. \tag{1}
\]

Under unlimited liability, the firm seeks to write an ex ante efficient 
savings contract but now with a solvency constraint. This contract is 
defined by 

\[
\max_{r}\; E\Pi \equiv [\theta r_1 + (1 - \theta)r_2](1 + e) - \{w[1 + \gamma\Theta(p_m)] + p_m\}m(1 + e) - (1 - \theta)B^0(p_s) - J(J - 1)p_s - r - qe
\]

subject to

\[
ED - q \ge 0 \tag{2}
\]

and

\[
r_1(1 + e) + W^0 - \{w[1 + \gamma\Theta(p_m)] + p_m\}m(1 + e) - B^0(p_s) - J(J - 1)p_s - qe - r \ge 0. \tag{3}
\]

If the respective shadow prices are \(\eta\) on (1), \(\hat{\eta}\) on (2), and \(\lambda\) on (3),
the corresponding (constrained) efficient savings contracts yield \(\eta = 1\)
for limited liability and \(\hat{\eta} = 1 - \lambda\) for unlimited liability. Under
limited liability, the managerial contract seeks to maximize the joint 
wealth of depositors and shareholders. Under unlimited liability, this 
objective is compromised by the tied sale of a financial return and 
insurance to depositors. 

Substitution of these conditions reveals that, one stage back when 
managerial contracts are written, the sole difference between limited 
and unlimited liability contracts is whether the solvency constraint 
defining \(W^0\) is binding. To facilitate comparative static comparisons of
the contracts, define an exogenous variable \(\delta = 0\) (respectively 1)
when the limited (unlimited) liability contract is in force. With limited
liability contracts, \(\delta = \lambda = 0\); with unlimited liability contracts, \(\delta = 1\)
and \(\lambda > 0\). For convenience, define \(\delta(\lambda)\), where \(\delta' > 0\).

The employment contract is defined solely by the monitoring activity,
the burden of unlimited liability on shareholders is defined
through shareholder search, and the size of the firm is defined by the
equity contract. These are given as the solutions to

\[
\begin{aligned}
\max\; ER \equiv{} & [(1 - \theta)r_1 + \theta r_2](1 + e) - \{w[1 + \gamma\Theta(p_m)] + p_m\}m(1 + e) - J(J - 1)p_s \\
& - \delta(\lambda)(1 - \theta)B^0(p_s) - q(1 + e) \\
& + \lambda\bigl(r_1(1 + e) + W^0 - \{w[1 + \gamma\Theta(p_m)] + p_m\}m(1 + e) - B^0(p_s) - J(J - 1)p_s - q(1 + e)\bigr).
\end{aligned}
\]

The corresponding first-order conditions are 

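\[
p_m:\quad -w\gamma\Theta'(p_m) - 1 = 0, \tag{4}
\]

\[
p_s:\quad -[\delta(\lambda)(1 - \theta) + \lambda]B^{0\prime} - J(J - 1)(1 + \lambda) = 0, \tag{5}
\]

\[
e:\quad (1 - \theta + \lambda)r_1 + \theta r_2 - (1 + \lambda)\bigl(\{w[1 + \gamma\Theta(p_m)] + p_m\}m' + q\bigr) = 0, \tag{6}
\]

\[
\lambda:\quad \lambda\bigl(r_1(1 + e) + W^0 - \{w[1 + \gamma\Theta(p_m)] + p_m\}m(1 + e) - B^0(p_s) - J(J - 1)p_s - q(1 + e)\bigr) = 0. \tag{7}
\]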
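
As a check on conditions (4) through (6), the derivatives can be worked out term by term. The following is a sketch under two assumptions: the shorthand \(C(p_m)\) for the per-manager cost is introduced here only for compactness and is not notation used elsewhere in the text, and \(\lambda\) is held fixed, so that \(\delta(\lambda)\) is not differentiated.

```latex
% Shorthand (introduced for this sketch only): cost per unit of managerial input.
\[
C(p_m) \equiv w[1 + \gamma\Theta(p_m)] + p_m .
\]
% Objective with the solvency constraint attached at shadow price \lambda.
\begin{align*}
\mathcal{L} ={}& [(1-\theta)r_1 + \theta r_2](1+e) - C(p_m)\,m(1+e) - J(J-1)p_s
   - \delta(\lambda)(1-\theta)B^0(p_s) - q(1+e) \\
 & + \lambda\bigl\{ r_1(1+e) + W^0 - C(p_m)\,m(1+e) - B^0(p_s) - J(J-1)p_s - q(1+e) \bigr\}. \\[4pt]
\frac{\partial \mathcal{L}}{\partial p_m} ={}& -(1+\lambda)\,[\,w\gamma\Theta'(p_m) + 1\,]\,m(1+e) = 0
   \;\Longrightarrow\; -w\gamma\Theta'(p_m) - 1 = 0, \\[4pt]
\frac{\partial \mathcal{L}}{\partial p_s} ={}& -[\delta(\lambda)(1-\theta) + \lambda]\,B^{0\prime}(p_s) - J(J-1)(1+\lambda) = 0, \\[4pt]
\frac{\partial \mathcal{L}}{\partial e} ={}& (1-\theta+\lambda)\,r_1 + \theta r_2 - (1+\lambda)\,[\,C(p_m)\,m'(1+e) + q\,] = 0 .
\end{align*}
```

The factor \((1 + \lambda)m(1 + e)\) divides out of the first condition, which is why the liability regime does not enter the monitoring margin directly, whereas \(\lambda\) remains in the conditions for shareholder search and for equity.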


B. Interpretation and Comparative Statics 

At this point, interpretative comments are in order. The imposition 
of the unlimited liability regime is treated as exogenous: \(\lambda\) is an exogenous
variable. Comparative statics (with \(\lambda > 0\)) reveal that \(dp_m/d\lambda = 0\),
\(dp_s/d\lambda > 0\), and \(de/d\lambda < 0\). These effects are driven completely by the
effect of unlimited liability on the value of the ownership rights of the 
firm. 

Equation (4) reveals that, with constant returns to monitoring indi¬ 
vidual managers, the incentive to monitor individual managers is not 
affected directly by the liability regime. Any effect on total moni¬ 
toring comes through the effect of unlimited liability on the value of 
the ownership rights in the firm and therefore the size of the firm. 
This would not be true if managerial effort, which is fixed here, were 
to vary with the liability regime. Then different liability regimes 
would dictate variation in monitoring. Equation (5) reveals that share¬ 
holder monitoring is in the interests of owners of unlimited liability 
firms. Each shareholder in an unlimited liability firm has an incentive 
to search out other shareholders whose wealth limitations will not 
constitute an additional levy should the firm be bankrupt. Equation 
(6) reveals that, for any monitoring effort, the imposition of unlimited 
liability imposes additional costs on equity owners and reduces the 
value of equity ownership relative to the opportunity cost facing
equity holders and, consequently, firm size. It is this prediction of 
firm size that is tested with both sets of data at our disposal. 

Given the apparent inefficiency of suppressing the natural selection 
forces of the market, why were unlimited liability rules ever imposed 
by regulation? In our view, the answer must lie in a private rationale 
for regulation, along the lines argued by Stigler (1971) and Peltzman
(1976). To be viable, unlimited liability rules must protect some rents 
for economic agents; to do so must have political survival value. Why 
rent protection takes this form or what determines its limits, for ex¬ 
ample, the supply of limited liability charters, remains a deeper but 
unanswered puzzle in our current analysis. 

In testing our model, we make two demands of it: First, can it 
predict the impact on firm size and rents between unlimited and 
limited liability regimes? Second, can we offer a consistent interpreta¬ 
tion of the changes in the law permitting limited liability when for¬ 
merly only unlimited liability was permitted? In particular, we use two 
sets of historical data: (i) on the use of unlimited liability in Scottish 
banking between 1795 and 1879 and (ii) on the imposition of un¬ 
limited liability in law. 

III. Evidence: Scottish Banking, 1795-1882 

From 1795 to 1845, the Scottish banking industry was characterized 
by open entry with note-issuing privileges for banks provided the 
entrant’s shareholders accepted unlimited liability. Three banks 
(Bank of Scotland, Royal Bank of Scotland, and British Linen Com¬ 
pany) enjoyed limited liability privileges granted by the Scottish Par¬ 
liament. In 1845, the Scottish Banking Act was passed, which 
severely restricted entry to firms that received charters from Parlia¬ 
ment. Only in 1879 were the limited liability restrictions effectively 
removed from Scottish banks. 5 We argue that unlimited liability re¬ 
strictions on the ability of the majority of Scottish banks to compete 
with the three banks with limited liability privileges were the conse¬ 
quence of private rent seeking. There are two directly testable impli¬ 
cations of our theory: first, limited liability banks should be more 
profitable than the unlimited liability banks, and, second, limited lia¬ 
bility banks should be larger than unlimited liability banks. Condi¬ 
tional on these narrow results, the third testable implication is our 
ability to explain the removal of liability regulation in Scottish bank¬ 
ing. We focus first on the efficiency results. 

5 In 1882, all Scottish banks took advantage of the limited liability provisions of the
1879 act. 





JOURNAL OF POLITICAL ECONOMY 


TABLE 1

Number of Scottish Banks for Selected Years

                      1772   1810   1830   1850   1885
Edinburgh               21     13     12      6      5
Glasgow                  5      4      5      4      2
Secondary burghs         4     12     15      7      2
Lesser burghs            1      8      4      0      1
Total                   31     37     36     17     10

Source: Checkland (1975), tables 2, 5, 9, 11, 16.


Testing the first proposition requires profitability data for Scottish 
banks. Such data are difficult to obtain. There is, however, indirect 
evidence supporting our hypothesis. The three limited liability banks 
of Scotland were established over the period 1695-1746. From the 
beginning of this period to 1845, when entry rules were changed, 
over 50 banks failed or left banking. 6 Over this period all three lim¬ 
ited liability banks survived. This observation on survivorship offers 
indirect support for the proposition that limited liability banks were 
more profitable than unlimited liability banks. 

Table 1 presents data on the number of Scottish banks in various 
selected years. All three limited liability banks were located in Edin¬ 
burgh. The total number of Edinburgh banks fell from 21 in 1772 to 
six in 1850. Of the six remaining Edinburgh banks in 1850, three 
were the limited liability banks. This fact again lends support to our 
hypothesis that unlimited liability put banks at a competitive disad¬ 
vantage to their limited liability competition. Faced with a choice, 
depositors continued to support those banks with limited liability. It is 
difficult to argue that Scottish depositors were ill informed and made 
the wrong choice. From 1695 to 1845, of the 30 or so Scottish banks 
that failed, none was a limited liability bank. 
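
The counts in table 1 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and the share calculation assumes, as the text states, that exactly three limited liability banks operated in each year considered (it is applied only to years before the 1879 act).

```python
YEARS = [1772, 1810, 1830, 1850, 1885]

# Number of Scottish banks by location, transcribed from table 1.
TABLE_1 = {
    "Edinburgh": [21, 13, 12, 6, 5],
    "Glasgow": [5, 4, 5, 4, 2],
    "Secondary burghs": [4, 12, 15, 7, 2],
    "Lesser burghs": [1, 8, 4, 0, 1],
}

N_LIMITED = 3  # the three chartered limited liability banks


def column_totals() -> list[int]:
    """Total number of banks in each tabulated year."""
    return [sum(column) for column in zip(*TABLE_1.values())]


def limited_share(year: int) -> float:
    """Share of the three limited liability banks in all Scottish banks."""
    total = column_totals()[YEARS.index(year)]
    return N_LIMITED / total


if __name__ == "__main__":
    print(dict(zip(YEARS, column_totals())))
    for year in (1772, 1850):
        print(year, round(limited_share(year), 3))
```

The column sums reproduce the totals row of the table, and the three limited liability banks account for a larger fraction of all banks in 1850 than in 1772.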

The data on average bank size displayed in table 2 reveal informa¬ 
tion on firm size, both number of branches and average branch size. 
From 1800 onward, the average size of the limited liability banks is 
almost 10 times that of the unlimited banks. The data on branches 
displayed in table 2 indicate the constraints on size incurred by those 
banks with unlimited liability. If unlimited liability increases suffi¬ 
ciently the cost of capital to banks, it would be difficult for these banks 
to branch. (The unlimited liability restriction had an effect similar to 
that of the no-branching provisions of U.S. banking laws.) In 1825, 
limited liability banks had on average about 15 branches, whereas 
unlimited liability banks had only three branches per bank. 
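
The branch figures can be compared directly. The following sketch is a small arithmetic aid, not part of the original analysis; the names are hypothetical, and the figures are the average numbers of branches per bank reported in table 2.

```python
# Average number of branches per bank: (limited liability, unlimited liability).
BRANCHES = {
    1772: (1.3, 0.07),
    1802: (12.3, 0.86),
    1825: (14.7, 2.82),
}


def branch_ratio(year: int) -> float:
    """Ratio of average branches per bank, limited to unlimited liability."""
    limited, unlimited = BRANCHES[year]
    return limited / unlimited


if __name__ == "__main__":
    for year in sorted(BRANCHES):
        print(year, round(branch_ratio(year), 2))
```

In every tabulated year the limited liability banks operate several times as many branches per bank as the unlimited liability banks, roughly five times as many in 1825.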

6 Checkland (1975, tables 2, 3, 9, 11, 16) records the historical characteristics of the 
Scottish banking industry. 
TABLE 2

Average Size of Limited Liability and Unlimited Liability Firms and Number
of Branches per Bank for Selected Years

        Average of Three Limited Liability Banks       Average of All Unlimited Liability Banks
        Average Total Assets    Average Number         Average Total Assets    Average Number
        (in Pounds              of Branches            (in Pounds              of Branches
Year    per Bank)               per Bank               per Bank)               per Bank
1772                            1.3                                            .07
1802                            12.3                                           .86
1825                            14.7                                           2.82

Source: Checkland (1975), pp. 237, 240, 424.


Free banking in Scotland ended in 1845. (Not only was entry re¬ 
stricted in 1845 but note issue for each bank was restricted effectively 
to an amount equal to its average value for a number of years before 
1845.) As table 1 indicates, there was a steady decline in the number 
of Scottish banks after 1845. Free entry is critical to our model. Con¬ 
founded entry and liability changes after 1845 preclude any test using 
the Scottish bank data after that date. 

The transition to limited liability for the Scottish banks was con¬ 
voluted. While an English Limited Liability Act of 1855 permitted 
English firms to have limited liability, it precluded Scottish firms. A 
companies act of 1862 had a section (sec. 182) applicable to all banks 
in the United Kingdom. In it, limited liability was generally permitted 
but unlimited liability was retained for note issue liability for banks. 
Furthermore, general creditors were ranked ahead of note holders. 
With unlimited liability still in force for note holders, no candidate 
Scottish bank altered its structure. Only in 1879 were the effects of 
unlimited liability substantially lessened for note issue. (Section 6 of a 
revised act of 1879 replaced sec. 182 of the act of 1862.) While un¬ 
limited liability was nominally retained for note issues, other changes 
effectively reduced the liability. If a Scottish bank was closed, the 
claims of note holders were satisfied first and then the claims of all 
other creditors were considered. General creditors were guaranteed a 
sum at least equal to that received by the note holders. By 1885, note 
issue represented 5 percent of total liabilities. Together, this meant 
that there was little likelihood that additional stockholders’ wealth 
would be at risk in the case of bankruptcy. Limited liability was now 
effectively feasible. In 1882, the seven remaining unlimited liability 
banks adopted the provisions of the 1879 act: Faced with a choice
between “de facto” limited and unlimited liability, all Scottish banks
chose limited liability. Again, the historical facts support our general
hypothesis of the efficiency of limited over unlimited liability. 

An outstanding issue concerns the political economy rationale for 
the change in liability constraints during this period. If unlimited 
liability imposed on most Scottish banks protects the rents to the three 
limited liability banks in Edinburgh, why eliminate this protection? 
Our answer is consistent with our central hypothesis but speculative. 
In general, political competition guarantees that particular forms of 
protection are replaced by alternatives when relative costs adjust. In 
particular, we speculate that unlimited liability was an increasingly 
costly form of rent protection. If there are economies of scale in 
banking (Benston, Hanweck, and Humphrey [1982] find such econo¬ 
mies over certain ranges of bank size for modern banking institu¬ 
tions), the costs of being inefficiently small (because of unlimited lia¬ 
bility) would increase with the size of the local banking market. We 
believe this to be true for nineteenth-century Scottish banking. Un¬ 
limited liability banks primarily serviced the area outside Edinburgh. 
As the size of the local markets in this area increased, the costs of 
unlimited liability also increased. In 1879 unlimited liability was elimi¬ 
nated. But protection to the original Edinburgh limited liability banks 
continued through the stringent entry restrictions of 1845 and the 
cartel-like arrangement effectively freezing the note issue of the exist¬ 
ing banks. An absolute entry barrier was substituted for the more 
costly unlimited liability entry barrier. We do not know whether these 
more stringent entry conditions compensated the Edinburgh banks 
adequately for the loss of their exclusive liability status. 

How does the public-interest hypothesis fare in the light of these 
data? From a public-interest perspective, the Scottish bank data rep¬ 
resent a difficult (perhaps irreconcilable) puzzle. Why allow three 
banks limited liability and impose unlimited liability on the rest of the 
note-issuing banks? Do the depositors of the three Edinburgh banks 
not need protection? If unlimited liability is needed to maintain 
public confidence in the banking system, why are some banks exempt 
from the very provisions needed for public confidence? Why is un¬ 
limited liability in the public interest until 1879 but contrary to the 
public interest after 1879? 


IV. Unlimited Liability for Lawyers 

Public policy toward industries that provide (sometimes complex) 
professional services, such as accounting, law, and medicine, allows 
associations of practitioners to regulate entry into the respective industry.
Part of the regulated arrangements in these industries involves,
historically at least, unlimited liability on the part of the practitioner:
ex post, the wealth of the supplier of the service is available to
compensate consumers for damages that flow from insufficient care 
taken by the provider of the service. The conventional argument is 
that unlimited liability is in the public interest because of the disci¬ 
pline, and therefore public benefit, such guarantees impose on the 
practitioner. We argue that, on the contrary, these rules support our 
hypothesis that mandated unlimited liability is the natural consequence 
of rent seeking. First, while reputational signals to consumers may not 
be perfect, some bonding arrangements are possible. Most profes¬ 
sionals incur substantial sunk investments, including not only the in¬ 
vestment in specific human capital required in their training but 
brand names accumulated over a period of years. For example, law 
and accounting firms enjoy reputations for comparative excellence, 
sufficiently so that partnership names are retained even after specific 
individuals are no longer active in firms; medical centers and hospi¬ 
tals build reputations for knowledge on specific illnesses. The return 
on these sunk investments is at risk if due care is not forthcoming. 
Second, if unlimited liability is an essential part of the signaling mech¬ 
anism in the market for professional services, an unregulated compet¬ 
itive arrangement should encourage its provision. 

A modest reinterpretation of our banking model is necessary for 
lawyers, but the same force applies: Unlimited liability by raising the 
cost of ownership rights discourages investment in the firm, causing 
legal firms to be inefficiently small. Lawyers in law partnerships act
not only as professionals themselves but as owners of residual rights 
from the professional efforts of others in the firm. As such, their 
wealth is exposed to call under unlimited liability should bankruptcy 
prevail. In addition, any partner is exposed to the inability of
others in the partnership to meet their financial obligation because of 
limitations to their personal wealth. Indeed, sudden downturns in the 
wealth of any practitioner unknown to others in the partnership may 
encourage that lawyer as a professional to chisel knowing that he can 
free-ride on his partners’ ability to deal with any solvency problem 
that would result. 

This possibility will encourage partners individually to allocate re¬ 
sources to monitor their partners’ wealth and to take action to avoid 
this contingency. With the marginal cost of such action strictly posi¬ 
tive, each partner will still be exposed to some strictly positive proba¬ 
bility of a call on their wealth in excess of their obligation to the 
partnership if each lawyer were personally solvent. Under competi¬ 
tive conditions, consumers of the legal services of the firm will be 
required to pay prices that fully reflect these contingencies, contin¬ 
gencies that are beyond the expected value of the coverage to the 
client. It is this additional payment for risk-neutral consumers that 
constitutes the first-order inefficiency to the consumer from unlimited
liability. It is the alteration in the monitoring and investment
margins from the additional cost of capital to the law firm that drives 
the law firm to be inefficiently small under unlimited liability. 

The concern over chiseling and residual liability arises in those legal
cases that are complex, in which the payoff to the client is more
sensitive to the lawyer’s skill and informational asymmetries between
the knowledgeable lawyer and uninformed client may be large. Such
cases may require teams of diversified legal talents as well. In contrast,
routine cases such as minor traffic offenses or standardized agreements
to purchase may offer little potential for shirking. If limited
liability were permitted, we would expect to see it exercised for larger
law firms dealing with more complex cases, perhaps for corporate
clients, leaving smaller firms as unlimited liability organizations to
deal with straightforward legal matters. If the more complex cases
also require the better legal talent, then we would expect to find that
the return to scarce talents would be larger under limited than
unlimited liability.

The prediction on firm size, identical to that developed earlier for 
banks, and the prediction on lawyers’ incomes were tested for the 
legal service industry in the United States. Until 1961, all states im¬ 
posed unlimited liability on practicing lawyers. In 1961, Alabama and 
Georgia allowed lawyers to incorporate and obtain limited liability for 
malpractice. While the personal wealth of individual partners was 
exposed to ex post seizure for malpractice by that partner, the per¬ 
sonal wealth of other partners could not be seized: liability in the 
partnership was no longer joint. By 1977, 19 other states allowed 
lawyers to adopt limited liability. 7 Data exist from the Census of Selected 
Services for the years 1972 and 1977 on revenues of lawyers by stan¬ 
dard metropolitan statistical area (SMSA). 

We use average receipts per legal establishment for SMSAs as a 
measure of average size of legal firm. This variable is regressed 
against a set of economic and demographic variables (income, popu¬ 
lation, number of manufacturing establishments, number of retail 
establishments, and number of selected service establishments) as well 
as a dummy variable for liability status of the relevant state. Liability 
status should affect only incorporated law firms. The 1972 census 
alone provides an appropriate breakdown of law firms into sole pro¬ 
prietorship, partnership, and corporate law firms. (In the other years, 

7 These states, shown with year of adoption of limited liability (taken from Cavitch 
[1983, pp. 82-126-82-129]), are Alabama (1961), Alaska (1968-69), Connecticut 
(1969), Florida (1967), Georgia (1961), Hawaii (1969), Iowa (1970), Kentucky (1962), 
Louisiana (1964), Minnesota (1973), Nevada (1963), New Jersey (1969), New Mexico 
(1969), North Dakota (1963), Ohio (1973), South Carolina (1962), Tennessee (1961-63),
Texas (1971), and Virginia (1962).
TABLE 3

Effect of Limited Liability on Average Size of Law Firms, 1972

                                             Dependent Variable
                               ------------------------------------------------
                               Average Size     Average Size     Average Size
                               of All           of               of Corporate
                               Law Firms        Partnerships     Firms
Independent Variable           (1)              (2)              (3)
-------------------------------------------------------------------------------
Constant                       28.2             93.0             -117.1
                               (1.7)            (2.8)*           (-1.8)
Liability dummy^a              10.2             6.3              79.?
                               (2.0)*           (.6)             (4.0)*
Income                         .016             .021             .048
                               (4.4)*           (2.8)*           (3.3)*
Population                     .00001           -.000003         -.00003
                               (.6)             (-.08)           (-.3)
Number of manufacturing        -.007            -.015            -.035
  establishments               (-1.4)           (-1.5)           (-1.7)
Number of wholesale            .016             .043             -.038
  establishments               (2.2)*           (2.9)*           (-1.3)
Number of retail               -.002            -.001            .027
  establishments               (-.5)            (-.1)            (1.8)
Number of selected service     -.00008          -.002            -.003
  establishments               (-.03)           (-.2)            (-.4)
R^2                            .31              .33              .26
Standard error of regression   28.2             56.1             108.6
Mean of dependent variable     114.1            211.7            164.1
Number of observations         156              156              156
-------------------------------------------------------------------------------

Source: 1972 Census of Selected Services.
Note: Data do not include Texas since it switched to limited liability in 1972. t-statistics are in parentheses.
* Significant at the 95 percent level.
a Dummy = 1 for limited liability, 0 for unlimited liability.


corporate law firms are included in professional service organizations 
together with all other types of law firms such as storefront firms.) 

Table 3 reports the 1972 results for all law firms, partnerships, and
corporate law firms. Table 4 reports the results for all law firms for
1977. The results from table 3, column 3, show that liability status
significantly affects law firm size: a change in liability status from
unlimited to limited increases the average size of corporate law firms
in 1972 (as measured by annual receipts) by approximately $79,000 (in
1972 dollars) or 48 percent. A similar result flows from the 1977 data using
average firm size over all legal firms as the dependent variable. These
results are consistent with our prediction on firm size.
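
The 48 percent figure can be recovered by expressing the liability coefficient as a share of the mean of the dependent variable. The following is a minimal sketch of that arithmetic, not the authors' procedure: the function name is hypothetical, the coefficient is entered as 79.0 on the strength of the "approximately $79,000" quoted in the text, and the mean of 164.1 (thousands of dollars) is taken from table 3, column 3.

```python
# Illustrative check of the percentage effect quoted in the text.
# The helper name is hypothetical; the inputs are figures quoted from table 3.

def percent_effect(coefficient, mean_dependent):
    """Express a dummy-variable coefficient as a percentage of the sample mean."""
    return 100.0 * coefficient / mean_dependent


# Column 3 of table 3: receipts in thousands of 1972 dollars.
# 79.0 is an approximation taken from the text ("approximately $79,000").
effect_1972 = percent_effect(79.0, 164.1)
print(f"Limited liability raises average corporate firm size by about {effect_1972:.0f} percent")
```

Measuring the effect against the sample mean is a convenience for interpretation; it does not change the underlying regression estimate.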

One potential problem with cross-sectional results is that liability 
status may be correlated with some excluded variable. If so, liability 
status in our equation estimated with cross-section data acts as a proxy 
for the left-out variable. While time-series data before and after the 
liability change in each relevant state would be superior, they are
unavailable. The results for 1972, however, do provide evidence on 








JOURNAL OF POLITICAL ECONOMY 


TABLE 4

Effect of Limited Liability on Average Size of Law Firms and Average Skill
Level of Lawyers, 1977

                                               Dependent Variable
                               ----------------------------------------------------
                               Average Receipts per        Average Income^a
                               Legal Establishment         per Lawyer
                               (with Payroll)              (with Payroll)
Independent Variable           (1)          (2)            (3)          (4)
-----------------------------------------------------------------------------------
Constant                       180.4        129.2          46.7         118.2
                               (26.8)*      (24.2)*        (.3)         (.9)
Liability dummy^b              18.0         12.1           16.9         16.6
                               (2.7)*       (1.9)          (3.0)*       (3.3)*
Income                         .003         -.0003         .0004        .003
                               (7.3)*       (-.2)          (1.4)        (2.1)*
Percentage of lawyers                                      .61          .57
  younger than 40                                          (1.0)        (1.0)
Percentage of lawyers male                                 .05          -.55
                                                           (.04)        (-.4)
Population                                  -.0007                      -.008
                                            (-.4)                       (-1.1)
Number of manufacturing                     -.01                        .006
  establishments                            (-2.0)*                     (1.3)
Number of wholesale                         .03                         -.001
  establishments                            (3.3)*                      (-.4)
Number of retail                            -.0005                      -.005
  establishments                            (-.2)                       (-2.6)*
Number of selected service                  -.0007                      .001
  establishments                            (-.5)                       (1.4)
R^2                            .31          .42            .28          .52
Standard error of regression   37.2         34.8           15.7         13.7
Mean of dependent variable     150.7        150.7          92.9         92.9
Number of observations         126          126            42           42
-----------------------------------------------------------------------------------

Source: 1977 Census of Selected Services.
Note: Although Hawaii permitted limited liability in 1969, a state supreme court decision in 197,1 ended limited
liability for lawyers. Therefore, Hawaii is omitted. Court decisions also eliminated limited liability for Ohio in 1985
and Georgia in 1983 (for professional matters only). Since these court decisions came after 1977, they are ignored in
this table.
* Significant at the 95 percent level.
a Income is gross billings minus nonpayroll operating expenses.
b Dummy = 1 for limited liability, 0 for unlimited liability.


this potential problem. Since partnerships are unaffected by rules 
governing liability status (partners in law firms always have unlimited 
liability status), our theory would predict that the liability dummy 
should not affect partnership size. If this dummy were a proxy for a 
left-out variable, it should affect partnership size. Table 3, column 2, 
reports the regression results for average size of partnerships. These 
results show that for partnerships the liability dummy is statistically 
insignificant, added evidence supporting our hypothesis. 

We also test our prediction on skill levels. In particular, average
rents per lawyer, calculated as the average income (net of expenses)
per lawyer, measure the average skill level of lawyers. We regress this
skill variable against the same set of exogenous variables used for the
size regression plus two variables that appear in traditional
earnings equations, average age of lawyers and a dummy variable
to reflect sex composition of the legal labor force. Our expectation
is that changes in liability status by state cause skill levels to adjust
only slowly. We believe that our 1977 data set is biased against
finding that liability status affects skill levels since most
states changing to limited liability did so only in the mid to late 1960s.
Table 4 presents our regression results. Column 3 indicates that a
movement from unlimited to limited liability significantly increases
average income per lawyer, a measure of average skills. Average income
per lawyer in 1977 increases by $16,000 or 17 percent as the
liability status is changed from unlimited to limited. Again, these
results are consistent with our model.

Mandated unlimited liability rules exist to protect the quasi rents of 
inefficiently small law firms. Curtailing the assembly of superior legal 
talent in larger and more efficient law partnerships also protects the 
rents to more modest legal talent. When law firms have the option of 
limited liability, competition would dissipate these rents. In a number 
of states, unlimited liability is still enforced for law firms. An inter¬ 
esting question is why this apparently inefficient arrangement con¬ 
tinues. One explanation is that the efficient size of even limited liabil¬ 
ity law firms is relatively small, and hence the cost of using liability 
rules to protect rents for lawyers is small. (Table 4 shows that for 1977 
average receipts of law firms were approximately $150,000 per year.) 
The final equilibrium of the legal status of liability for lawyers across 
states remains yet undefined, but a question is raised. What accounts 
for the change in liability status of law firms? Could it be that those 
states with relatively large efficient law firm size have already elimi¬ 
nated unlimited liability? 


V. Conclusions 

Liability rules selectively applied to firms supplying certain products 
and services serve the private interests of some firms and not the 
public interest of protecting uninformed users of these outputs or 
uninformed suppliers of credit against fraud. This result is the central
hypothesis of this paper. We illustrate the effects of unlimited versus 
limited liability in a simple agency model of the firm. The model 
identifies the impact of the cost of capital on the organization of the 
firm. 

The model yields the testable prediction that with mobile resources 
unlimited liability firms should be smaller than their limited liability 
counterparts. Tests on average firm size, using data on eighteenth-century
Scottish banks and modern U.S. law firms, support our predictions.
Results on the impact of limited liability status on average
income per lawyer may also be interpreted within our model. 

The theory is richer than these narrow predictions since it offers a 
consistent private-interest explanation for the removal of these un¬ 
limited liability restrictions on Scottish banks. In the light of changes 
in the relative costs of rent protection, other instruments protect rents 
more efficiently. Equilibrium features of the current U.S. market for 
legal services remain an outstanding puzzle. 


References 

Becker, Gary S., and Stigler, George J. “Law Enforcement, Malfeasance, and
Compensation of Enforcers.” J. Legal Studies 3 (January 1974): 1-18.

Benston, George J.; Hanweck, Gerald A.; and Humphrey, David B. “Scale
Economies in Banking: A Restructuring and Reassessment.” J. Money,
Credit and Banking 14, no. 4, pt. 1 (November 1982): 435-56.

Cavitch, Zolman. Business Organizations: With Tax Planning. Vol. 4A. New
York: Bender, 1983.

Checkland, S. G. Scottish Banking: A History, 1695-1973. Glasgow: Collins,
1975.

Easterbrook, Frank H., and Fischel, Daniel R. “Limited Liability and the
Corporation.” Univ. Chicago Law Rev. 52 (Winter 1985): 89-117.

Friedman, Milton, and Schwartz, Anna J. A Monetary History of the United
States, 1867-1960. Princeton, N.J.: Princeton Univ. Press (for NBER),
1963.

Halpern, Paul; Trebilcock, Michael; and Turnbull, Stuart. “An Economic
Analysis of Limited Liability in Corporation Law.” Univ. Toronto Law J. 30
(Spring 1980): 117-50.

Manne, Henry G. “Our Two Corporation Systems: Law and Economics.”
Virginia Law Rev. 53 (March 1967): 259-84.

Meiners, Roger E.; Mofsky, James S.; and Tollison, Robert D. “Piercing the
Veil of Limited Liability.” Delaware J. Corporate Law 4, no. 2 (1979): 351-67.

Mill, John Stuart. “Testimony.” In Report from the Select Committee on Investments
for the Savings of the Middle and Working Classes, vol. 19. London: House
of Commons, 1850.

Peltzman, Sam. “Toward a More General Theory of Regulation.” J. Law and
Econ. 19 (August 1976): 211-40.

Posner, Richard A. “The Rights of Creditors of Affiliated Corporations.”
Univ. Chicago Law Rev. 43 (Spring 1976): 499-526.

Shannon, Herbert A. “The Coming of General Limited Liability.” Econ. Hist.
2 (January 1931): 267-91.

Shapiro, Carl, and Stiglitz, Joseph E. “Equilibrium Unemployment as a
Worker Discipline Device.” A.E.R. 74 (June 1984): 433-44.

Smith, Adam. An Inquiry into the Nature and Causes of the Wealth of Nations.
1776. Reprint. New York: Random House, 1937.

Stigler, George J. “The Theory of Economic Regulation.” Bell J. Econ. and
Management Sci. 2 (Spring 1971): 3-21.

Woodward, Susan E. “On the Economics of Limited Liability.” Working Paper
no. 371. Los Angeles: Univ. California, Dept. Econ., 1985.



High School Graduation, Performance, 
and Wages 


Andrew Weiss 
Using data from the Panel Study of Income Dynamics and a proprietary
sample of semiskilled production workers, this paper investigates
the reasons for the discontinuous increase in wages associated
with graduation from high school. I find a discontinuous decrease in 
workers’ propensities to quit or be absent. However, I do not find 
that high school graduates have a comparative advantage in produc¬ 
tion jobs requiring more training, nor in either sample is there a 
discontinuous increase in required training associated with the jobs 
held by high school graduates. The wage premium associated with 
graduation from high school appears to be procyclical: falling dur¬ 
ing slumps, periods in which employers are likely to be hoarding 
labor and in which quits and absences are least important to firms. 
There is also some evidence suggesting that prior quits have a larger 
effect on the wages of high school graduates than on the wages of 
high school dropouts. 


I. Introduction 

It has often been noticed that graduation from high school induces a 
discontinuous upward shift in the relationship between schooling and 
wages. For example, Hashimoto and Raisian (1985) find that completion
of high school increases the earnings of men by between five and
six times as much as does completion of a year of education that was
not associated with completion of high school. This effect held
regardless of whether workers were employed at small, medium, or
large firms. 1

Using data from the Panel Study of Income Dynamics (PSID), I 
found that even when education squared and education cubed were 
included as additional independent variables in a wage equation, the 
increase in wages associated with completion of high school was more 
than three times as great as the increase in wages associated with 
completion of eleventh grade (see table 9 below). Hence the discon¬ 
tinuity in wages associated with completion of high school does not 
appear to be due to nonlinearities in the relationship between educa¬ 
tion and wages. 
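
One way to write the kind of wage equation described in this paragraph is sketched below. It is an illustration only: the symbols $E_i$ (years of education), $HS_i$ (an indicator equal to one if worker $i$ completed high school), $X_i$ (other controls), and all coefficient names are assumptions introduced here and are not taken from the estimated equation.

```latex
\[
\ln w_i = \beta_0 + \beta_1 E_i + \beta_2 E_i^2 + \beta_3 E_i^3
          + \delta\, HS_i + \gamma' X_i + \varepsilon_i
\]
```

In a specification of this form, any smooth nonlinearity in the return to education is absorbed by the squared and cubed terms, so a positive $\delta$ would indicate a discrete jump at graduation over and above the polynomial trend.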

I investigated possible reasons for this discontinuity. Using a pro¬ 
prietary data set of semiskilled manufacturing workers, I found that 
high school graduation was associated with a discontinuous fall in quit 
propensity and absenteeism. On the other hand, there is not a discon¬ 
tinuous increase in output associated with completion of high school, 
nor did I find that high school graduates had a comparative advan¬ 
tage in the more complex jobs in this study (job assignment was ran¬ 
dom). 

This evidence suggests that at least part of the relationship between 
secondary education and wages may be due to the discontinuous de¬ 
crease in quit and absenteeism rates associated with completion of 
high school. 2 

1 The following data, from the Current Population Survey, appear in Hashimoto and 
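Raisian (1985, p. 730) (t-statistics are in parentheses):

Regression Results of Male Earnings

                                       U.S. Firm Size
                          ------------------------------------
                          Small        Medium       Large
--------------------------------------------------------------
Years of schooling        .0367        .0362        .0296
                          (5.6)        (4.2)        (5.0)
High school graduate      .1630        .1735        .1286
                          (4.9)        (3.9)        (4.5)
University graduate       .0630        .0808        .1590
                          (1.7)        (1.8)        (5.6)
--------------------------------------------------------------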
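
The "between five and six times" comparison in the text can be checked against the coefficients in the table above. The sketch below rests on an assumption that is not stated in the source: the year in which high school is completed is taken to earn the per-year schooling coefficient plus the graduation coefficient, and that sum is compared with the per-year coefficient alone. The variable names are hypothetical.

```python
# Illustrative check of the "five to six times" comparison.
# Coefficients are quoted from the table above: (years of schooling, high school graduate).
coefficients = {
    "small": (0.0367, 0.1630),
    "medium": (0.0362, 0.1735),
    "large": (0.0296, 0.1286),
}

# Assumption: the graduating year earns the ordinary per-year return plus the
# graduation premium, so the ratio is (year + graduate) / year.
ratios = {
    size: (per_year + graduate) / per_year
    for size, (per_year, graduate) in coefficients.items()
}

for size, ratio in ratios.items():
    print(f"{size}: graduating year is worth {ratio:.2f} times an ordinary year")
```

Under this reading the ratio falls between five and six for each of the three firm-size groups.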


2 In two-stage wage equations estimated using the PSID, the predicted probability of
a worker’s quitting had a significant negative effect on wages. In particular, when high
school graduation was not included directly in the wage equation, I estimated that
workers with a 10 percent higher predicted probability of quitting received between 7 
percent and 10 percent lower wages, depending on the form of the wage equation 
The PSID data support this view. In the data high school graduates
have lower quit rates than would be expected from a continuous 
relationship between quits and education. On the other hand, there is 
not a discontinuous increase in required training for jobs held by high 
school graduates. Finally, it appears that the wage premium associ¬ 
ated with high school graduation is procyclical: rising during booms, 
when quits and absences are most harmful to employers, and falling 
during slumps. (This last finding is especially tentative since problems 
of collinearity hindered me in separating procyclical effects of high 
school graduation on wages from procyclical effects of education on 
wages.) 

Because there are unlikely to be demographic characteristics that 
affect quit propensity (or the propensity to be absent) and do not 
directly affect wages, it is difficult to use standard data sets to estimate 
the impact of differences in quit propensity on wages. However, Mirvis
and Lawler (1977) directly calculated the cost of quits and absenteeism
for a sample of bank tellers. Their measurements suggest that
a considerable portion of the wage premium associated with gradua¬ 
tion from high school can be explained by the lower propensities to 
quit or be absent of high school graduates. 3 


being estimated. Of course, the negative correlation between wages and quit propensity 
is due in part to the positive correlation between high school graduation and quit 
propensity already mentioned. This problem would be avoided if high school gradua¬ 
tion was included not only indirectly in the wage equation as one of the instrumental 
variables predicting quit propensity but also directly as one of the independent vari¬ 
ables. When both predicted quit propensity and high school graduation were included 
in a wage equation, both were statistically significant; however, in that case there were 
no variables affecting quit propensity that did not directly affect wages. Hence, the 
model was identified only by functional form restrictions. Consequently, I did not
pursue this approach further. The reader may obtain copies of the two-stage estimates 
by writing directly to the author. 

3 Mirvis and Lawler (1977) calculated that the total cost of turnover of bank tellers 
was 85 times as large as their daily earnings plus benefits. Hence, if graduation from 
high school decreases a worker's probability of quitting during his or her first year on 
the job from 20 percent to 10 percent (assuming the average work year was 240 days), 
high school graduation would increase earnings of newly hired bank tellers by 4.4 
percent. This calculation was made by assuming a constant quit rate and 240 workdays 
per year so that a worker with a 10 percent per year probability of quitting has a daily 
quit probability of .000439 and a worker with a 20 percent per year quit probability has 
a daily quit probability of .000929. Let the wages of the low-quit-rate and high-quit-rate
worker be denoted $w_0$ and $w_1$, respectively. Let the cost of a quit be $85w_0$. Then in
equilibrium

\[
\begin{aligned}
w_0 + 85 \times .000439\,w_0 &= w_1 + 85 \times .000929\,w_0, \\
1.0373\,w_0 &= w_1 + .0790\,w_0, \\
w_0 &= 1.044\,w_1.
\end{aligned}
\]

Mirvis and Lawler also estimated that the total cost of absenteeism for a sample of 160
bank tellers was more than twice the cost in salaries and benefits. Thus if, as estimated
below, high school graduation results in a 14 percent decrease in the percentage of days
I conclude from this research that standard estimates of rates of
return to education are capturing, in part, the higher earnings of
high school graduates associated with their lower quit rates and lower
rates of absenteeism. To the extent that these traits were not learned
in secondary school, the standard models are overestimating rates of
return to secondary schooling. To the extent that these traits were
selectively learned in primary school, standard models are
underestimating the rates of return to primary education. 4

These biases obtain in both signaling and human capital theoretic 
models of schooling and wages. 5 In the signaling model, individuals
choose their length of secondary schooling to signal these traits to 
potential employers (see Spence 1974). Schooling is an effective signal 
of these traits because the same attributes that give workers low
quit propensities and low rates of absenteeism are likely to give them 
low nonpecuniary costs of schooling. It is unlikely that a low quit 
propensity can be directly observed for new entrants into the labor 
force. High school transcripts generally do not contain information 
on absenteeism or tardiness, and high schools generally either fail to 
respond to requests by firms for transcripts or respond too slowly to 
affect hiring decisions (see Bishop 1986). Even if low quit and absen¬ 
teeism propensities were directly observed by firms but not by the 
researcher (as in some human capital models), standard estimates of 
rates of return to education would still be biased upward. However, 
schooling decisions would not be distorted (see Griliches [1977] and 
Hausman and Taylor [1981] for analyses of the role of unobserved 
attributes in human capital models). 

Alternatively, one could consider a human capital model in which 
the same unobserved traits that cause individuals to have low quit and 
absenteeism rates increase the efficiency with which they learn in 
school—increasing the returns to schooling and making it more likely 
that students complete high school. Then the increase in wages associ¬ 
ated with graduation from high school would be due to the superior 


absent, then one would expect that at a 6 percent absence rate the negative correlation 
between absenteeism and education leads to roughly a 2 percent higher wage for high 
school graduates working as bank tellers. (This calculation is made by noting that a 
worker with a 6 percent absence rate is paid 12 percent less than a [hypothetical] 
worker with zero expected absences, while a worker with a 6.84 percent absence rate is
paid 13.68 percent less. Hence decreasing one's expected absence rate from 6.84 to 6 
percent would increase one's earnings by 1.95 percent.) 
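
The arithmetic in this note can be reproduced with a short script. This is a sketch under the note's own stated assumptions (a constant quit rate, 240 workdays per year, a quit cost of 85 days of the low-quit worker's pay, and absenteeism costing about twice salary and benefits); the function and variable names are hypothetical and introduced only for illustration.

```python
# Sketch of the footnote's arithmetic. Names are hypothetical; the numerical
# inputs are the ones quoted in the footnote.

WORKDAYS_PER_YEAR = 240
QUIT_COST_IN_DAILY_WAGES = 85   # cost of one quit, in multiples of the daily wage w0
ABSENCE_COST_MULTIPLE = 2       # absenteeism costs about twice salary and benefits


def daily_quit_probability(annual_probability, workdays=WORKDAYS_PER_YEAR):
    """Constant daily quit rate implied by a given annual quit probability."""
    return 1.0 - (1.0 - annual_probability) ** (1.0 / workdays)


p_low = daily_quit_probability(0.10)   # high school graduate
p_high = daily_quit_probability(0.20)  # dropout

# Equilibrium condition: w0 + 85*p_low*w0 = w1 + 85*p_high*w0, solved for w0 / w1.
wage_ratio = 1.0 / (1.0 + QUIT_COST_IN_DAILY_WAGES * (p_low - p_high))

# Absenteeism: a worker with absence rate a is paid about 2a less than a
# hypothetical worker with no absences.
absence_low = 0.06
absence_high = absence_low * 1.14  # 6.84 percent
absence_premium_percent = 100.0 * (
    (1.0 - ABSENCE_COST_MULTIPLE * absence_low)
    / (1.0 - ABSENCE_COST_MULTIPLE * absence_high)
    - 1.0
)

print(f"daily quit probabilities: {p_low:.6f}, {p_high:.6f}")
print(f"w0 / w1 = {wage_ratio:.4f}")
print(f"absence premium = {absence_premium_percent:.2f} percent")
```

Because the note rounds the daily probabilities before solving, the unrounded wage ratio computed here may differ from the reported 1.044 in the last digit.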

4 Studies of rates of return to education in countries in which there are many early 
school leavers typically estimate very high rates of return to primary schooling; these 
estimates are much higher than estimated returns to secondary schooling (see Psacha- 
ropoulos 1981). 

5 Note that in both the sorting models and some versions of the human capital model
the error term in an estimated earnings equation is correlated with the education 
variable, causing the estimated coefficient to be biased. 
cognitive skills of high school graduates. However, I find this last
explanation unpersuasive and not supported by the data. 

II. Overview of the Data 

The data for this study come from the PSID and from a proprietary 
data set that I assembled consisting of the personnel files of 2,920
newly hired, semiskilled production workers employed by a high- 
wage, unionized firm at three widely separated geographic locations. 

These data sets have different advantages and drawbacks. Conse¬ 
quently, I am encouraged that the overall story is consistent with the 
evidence in both data sets. The PSID data have two major disadvan¬ 
tages. First, they do not contain a direct measure of output or a good 
measure of absenteeism. Second, since education typically affects job 
assignments and promotional opportunities, it would be difficult, if 
not impossible, using the PSID or other standard data sets to deter¬ 
mine whether lower quit rates of high school graduates are due to the 
jobs they have or to attributes of high school graduates. 

The main drawback from using data derived from the personnel 
files of a single firm is that they may be subject to serious sample 
selection biases. That issue is addressed later in this section, where I 
argue that the effects of sample selection bias are likely to be small. In 
Section VI, I show that, to the extent that sample selection biases are 
present, they are likely to lead to underestimates of the negative cor¬ 
relation between high school graduation and propensities to quit or 
be absent. 

On the other hand, using data on workers in similar jobs at a single 
firm has several important advantages over standard survey data. 
First, I was able to obtain detailed records of the physical output of a 
large number of workers and expert evaluation of the complexity of 
the jobs to which they were assigned. Second, by limiting the sample 
to workers on similar jobs at the same firm, I was able to focus on the 
effects of individual attributes on output, absenteeism, and quit pro¬ 
pensity, holding constant firm and job effects. Third, since these data 
were copied from personnel records, they are likely to be more accu¬ 
rate than those obtained in surveys. The difference in accuracy is 
especially relevant for data on absences and job characteristics. (Mel¬ 
low and Sider [1983] found that 42.3 percent of the workers surveyed 
reported for themselves a three-digit occupation different from the 
one reported by their employer.) Fourth, because this data set con¬ 
tains direct measures of three aspects of productivity—output per 
hour, absenteeism, and quits—it is possible to measure the effects of 
education separately for each of those aspects of productivity. Fifth, 
because the data include a measure of the complexity of the job to 
which the worker was assigned and because job assignment was
(nearly) random, I could estimate whether better-educated workers
have a comparative advantage on more complex jobs. 6 Finally, because
in this sample the wage schedule faced by a worker was independent
of his actual or predicted performance or absenteeism and I
was able to partially control for alternative opportunities, I avoided
some of the problems present in standard data bases in which wage
differences bias estimates of the effects of demographic characteristics
on quits, absences, or productivity.

Although all workers at each location in the sample received the 
same wage, they obviously did not have the same alternative opportu¬ 
nities. The effect of differences in alternative opportunities on quits 
and absences is discussed in Section III. A detailed description of this 
data set is presented in Appendix A. 

Before investigating the effect of education, and particularly 
graduation from high school, on the performance of workers in the 
sample, I first examine the effect of education and other observable
characteristics on previous wages for some of the workers in the 
sample. 

Job applicants at one of the plants used in constructing the data 
base (referred to as plant A) were asked their wage at their most 
recent job and whether or not they were currently employed at that 
job. Using those data, I estimated the effect of an additional year of 
education on the previous pay of workers who were employed when 
they applied for their current job. 7 Column 2 of table 1 contains


6 In general, studies that rely on productivity measures of experienced workers
within a given job classification are subject to important sample truncation biases. 
Landau and Weiss (1985) have shown that, in a model with heterogeneous labor, if all 
workers have to meet a given productivity standard before being assigned to a given job 
and are promoted if they exceed some higher standard, then even if output Q were
equal to education (or experience) times ability (i), on any given job, productivity could
be uncorrelated or negatively correlated with education (or experience). For example, 
if output Q = ix, where x could be experience or education and the normalized density 
of unobserved ability f(i) = A» p , then average productivity would be uncorrelated with 
the observed characteristic x. If promotion criteria are less stringent for the better 
educated, as would appear from table 4 of Medoff and Abraham (1981), a negative 
correlation between productivity and education could obtain for a wide range of distributions
of unobserved ability. This problem does not arise in the data here since none
of the workers was promoted, job assignments were (nearly) random, and sample 
selection bias was likely to be small. 

7 I excluded from the sample workers that were unemployed when hired since the 
wage on their previous job, one that they either could not keep or found less desirable 
than unemployment, does not seem representative of their expected earnings. Wages 
at the previous job were used because at the firm studied wages for experienced work¬ 
ers are a function solely of seniority and the output of their pay group. Obviously this is 
a biased estimate of the effect of education on earnings for workers in the population: 
the workers in the sample chose to leave the jobs for which I have recorded their 
earnings. In Sec. VI, I analyze the effect of this bias. 
TABLE 1

Natural Log of Hourly Earnings in 1979
(Current Population Survey Sample versus Weiss Sample)

                                                    Plant A in
                                      CPS*          Weiss Sample†
                                      (1)           (2)

Intercept
Nonwhite                              -.065         -.051
                                      (-5.14)       (-2.11)
Female                                -.247         -.253
                                      (-35.47)      (-13.06)
Nonwhite x female                      .050          .052
                                      (2.81)        (1.43)
Education                              .018          .166
                                      (2.28)        (2.25)
(Education/10)^2                       .098         -.519
                                      (3.61)        (-1.86)
Job tenure                             .013          .034
                                      (7.15)        (90)
(Job tenure/10)^2                     -.036         -.184
                                      (-11.61)      (-2.64)
Other experience                       .014          .042
                                      (8.01)        (3.57)
(Other experience/10)^2               -.021         -.056
                                      (-11.13)      (-4.84)
Education x job tenure                 .00060        .0015
                                      (5.15)        (.50)
Education x other experience          -.00033       -.0017
                                      (-3.29)       (-1.99)
Sample size                           18,551        1,272
R^2                                    .537          .204

Note: The smaller R^2 in col. 2 compared with col. 1 is due, in part, to the smaller variance of the independent
variables among workers at plant A.

* The other independent variables in the col. 1 regression were whether the worker belonged to a union, the
percentage of workers unionized in the worker's industry, interactions between union membership and firm size,
and plant size.

† The data in col. 2 are from the only plant in this study for which wage data on the previous job are available. The
dependent variable is the logarithm of the wage at the most recent job divided by the mean wage in the economy at
the time the job was held, or ln(most recent previous pay ÷ average pay in the United States at that date). On the
present job the lifetime wages of all the workers are approximately identical.


estimates of the effect of various demographic variables on the 
logarithm of the previous hourly wage of workers at that plant. Col¬ 
umn 1 reproduces the coefficients estimated by Mellow (1982) for 
workers in the 1979 Current Population Survey (CPS). When one evaluates
the marginal effect of education at the mean values in each sample
for education, tenure, and experience, 8 $\partial \ln(\text{wage})/\partial(\text{education}) = .043$
for the CPS sample and .037 for this sample.
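
The CPS figure can be checked by hand from column 1 of table 1 and the sample means reported in footnote 8. The sketch below is only an illustration of that arithmetic: it assumes the marginal effect is the derivative of the fitted log-wage equation with respect to education, so the squared term contributes 2 × coefficient × education/100 and each interaction contributes its coefficient times the interacted variable. The function and constant names are made up for the example.

```python
# Column 1 (CPS) coefficients from table 1 that involve education.
B_EDUC = 0.018            # Education
B_EDUC_SQ = 0.098         # (Education/10)^2
B_EDUC_TENURE = 0.00060   # Education x job tenure
B_EDUC_OTHER = -0.00033   # Education x other experience


def marginal_effect_of_education(educ, tenure, other_exp):
    """Derivative of ln(wage) with respect to years of education."""
    return (B_EDUC
            + 2.0 * B_EDUC_SQ * educ / 100.0   # d/dE of b * (E/10)^2
            + B_EDUC_TENURE * tenure
            + B_EDUC_OTHER * other_exp)


# Means for the 1979 CPS taken from footnote 8.
cps_effect = marginal_effect_of_education(educ=12.64, tenure=6.51, other_exp=11.57)
print(round(cps_effect, 3))  # 0.043
```

The same calculation cannot be repeated here for column 2, because the corresponding means for the plant A sample are not reported in this section.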

Table 1 can be used in various ways depending on how ambitious
one wishes to be. First, the estimates of $\partial \ln(\text{wage})/\partial(\text{education})$ provide a

8 The relevant mean values for the 1979 CPS are mean(education) = 12.64,
mean(tenure) = 6.51, and mean(other experience) = 11.57, where other experience is
measured by age - education - tenure - 6.
measure of the effect of education on earnings (in their previous job)
for the subsample of workers for whom there are data on previous
wages. One can then estimate how much of that effect can be ex¬ 
plained by the partial correlation between education and various 
aspects of performance for workers in that limited sample. This 
approach does not make any assumptions that the sample is represen¬ 
tative of a larger population but simply seeks to find those factors that 
contributed to earnings differences on previous jobs for workers 
within this subsample. It does assume that performance on current 
jobs is correlated with performance on previous jobs. I shall, however, 
also assume that workers for whom usable pay data were available are 
representative of workers in the sample, so that a year of education 
has approximately a 4 percent effect on the hourly wage on other jobs 
for the workers in the larger sample. 9

A second use of the data in table 1 is to provide a partial check of 
the importance of sample selection bias for the sample if one wishes to 
generalize the results outside the sample. Since the major concern 
here is in explaining the relationship between education and wages, it 
is useful to check to see if the return to education in the sample is 
biased. This is a potentially serious problem because one would ex¬ 
pect that better-educated individuals who apply for these jobs are less 
representative of their schooling cohort than less well educated work¬ 
ers. However, the estimated value of the return on education for 
workers for whom there are wage data is roughly the same as returns 
estimated using the CPS sample. Thus it does not seem that this 
sample differs from the CPS sample in ways that grossly distort the 
effect of education on earnings. In addition, the sign of the effect of 
race, sex, education, tenure, and experience on the logarithm of the 
(previous) wage is the same for this sample as for the randomly se¬ 
lected CPS sample. 

The personnel practices at these plants also provide grounds for 
believing that the sample is representative. Only 22 percent of appli¬ 
cants whose applications were reviewed were rejected. Of those, 85 
percent were rejected because of a low score on the Crawford Physical 
Dexterity Test. 10 The estimation procedures included the worker’s
score on the dexterity test as an independent variable, thus eliminating

9 I had usable data on previous pay for 77 percent of the workers employed at plant
A (those workers constituted 43 percent of the entire sample).

10 Of course some residual sample selection bias occurs if unobserved attributes that
lead workers to wait in line longer, such as the ability to withstand the cold, are correlated
with performance and with observed attributes studied here. Applications were
reviewed in order of the applicant's place in line with some adjustments to ensure racial
and sexual balance. At the location used for table 1 the individual’s place in line is
known. It was uncorrelated with any of the measures of performance. Consequently, I
have assumed that this source of bias is small and have disregarded it.

that potential source of sample selection bias. In addition, the
average pay increase for workers for whom there were wage data was 
103 percent, and workers waited for more than 36 hours in freezing 
temperatures to receive job application forms. This evidence indicates 
that the firm was paying above-market wages. These high wages are 
likely to reduce the sample selection problems derived from more 
“able” applicants not applying to the firm, where ability refers to 
unobserved worker characteristics that are correlated both with ob¬ 
served characteristics and with productivity. These high wages and 
the concomitant excess supply of job applicants were routine features 
at the manufacturing locations of this unionized firm. At another 
location of this firm, when recall notices were mailed to former em¬ 
ployees, 90 percent of those who had found alternative work quit 
their jobs to return to work for the firm. 

I was also able to test for possible biases introduced by the job 
assignment decisions of the firm. The personnel officers of the firm 
maintained that newly hired workers were randomly assigned to 
entry-level jobs. I tested whether this policy was followed in practice 
by regressing the measure of the complexity of the job to which a 
worker was assigned against all the observed demographic character¬ 
istics of the workers, including schooling, previous work experience, 
and scores on each part of the physical dexterity test. Those explana¬ 
tory variables were neither economically nor statistically significant, 
nor were they jointly significant. 11 Consequently I have assumed that
sample selection and job assignment biases were small and have not 
corrected for them. 12 
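
The random-assignment check described above amounts to a joint significance test in a regression of job complexity on worker characteristics. The following is a minimal sketch of such a test on simulated data, not the author's procedure or data: the variable names, the simulated distributions, and the use of a plain OLS F-statistic are all assumptions made for illustration (the schooling mean of 12.1 and standard deviation of 1.2 are borrowed from the sample description in footnote 17).

```python
import numpy as np


def joint_f_statistic(y, X):
    """F-statistic for H0: all slope coefficients are zero in y = a + Xb + e."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])          # add an intercept
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)  # OLS fit
    resid = y - design @ beta
    rss_unrestricted = float(resid @ resid)
    rss_restricted = float(((y - y.mean()) ** 2).sum())  # intercept-only model
    return ((rss_restricted - rss_unrestricted) / k) / (rss_unrestricted / (n - k - 1))


rng = np.random.default_rng(0)
n = 1000
traits = np.column_stack([
    rng.normal(12.1, 1.2, n),    # years of schooling (simulated)
    rng.normal(5.0, 3.0, n),     # previous work experience (simulated)
    rng.normal(50.0, 10.0, n),   # dexterity score (simulated)
])

# Case 1: complexity assigned independently of worker traits.
complexity_random = rng.normal(1.0, 0.5, n)
# Case 2: better-educated workers systematically get more complex jobs.
complexity_sorted = complexity_random + 0.3 * (traits[:, 0] - 12.1)

f_random = joint_f_statistic(complexity_random, traits)
f_sorted = joint_f_statistic(complexity_sorted, traits)
print(f_random, f_sorted)
```

Under random assignment the statistic should be small (near 1 on average), whereas sorting on schooling produces a very large value; a finding like the one reported in the text corresponds to the first case.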

Sections III and IV present evidence linking the wage premium 
received by high school graduates to their low propensities to quit or 
be absent. In Section V, I present corroborating evidence suggesting 
that the wage premium received by high school graduates is unlikely 
to be due solely to skills learned in high school. Section VI discusses 
the effects of selection bias on the results. Section VII contains some 
concluding remarks. 

III. Models 

Typically an individual’s choice of a level of education is a function of 
both observable and unobservable traits. The traits that are observable

11 I did find, however, that males were assigned to simpler jobs.

12 Alternatively I could have used the Bloom and Killingsworth (1984) procedure to
correct for sample selection bias. (Because the characteristics of excluded observations
are unknown, the standard Heckit correction cannot be used.) However, their procedure
is extremely sensitive to assumptions about the distribution of the error term in
the selection equation. Indeed, as they point out, the equation being estimated is
identified only if the assumed distribution of the error term of the selection equation
is nonlinear. See Muthén and Jöreskog (1983) for a discussion of this issue.

for this sample and that affect the schooling decision include age,
race, sex, and manual dexterity. I grouped together under the rubric 
“stick-to-itiveness” all the unobserved attributes that affect an individ¬ 
ual’s choice of a level of schooling. Stick-to-itiveness represents the 
combined effect on schooling level of characteristics such as self- 
discipline, desire for variety, and susceptibility to illness or to the urge 
to “take a day off.” Years of schooling and high school graduation 
were used as proxies for the traits referred to as stick-to-itiveness. 

In addition to representing these unobserved characteristics, edu¬ 
cation also indicates a level of training. One skill that is likely to have 
been acquired in school is the ability to learn complex tasks. Of 
course, that skill may have been acquired prior to the years of school¬ 
ing across which the sample differs (almost the entire sample had at 
least 9 years of education) and influenced the individual’s choice of a 
level of education. I assume that skills such as the ability to learn
complex tasks that affect the productivity of workers and that may 
plausibly have been learned in secondary school were learned there. 
That is, I intentionally biased the analysis in favor of a learning explanation
for the correlation between education and wages.

The performance equations estimated are whether or not the 
worker quit during his first 6 months on the job, absenteeism (both 
days absent and occasions of absenteeism), and output per hour dur¬ 
ing the first month on the job as a fraction of expected output given 
the complexity of the job. (These normalizations are routinely per¬ 
formed by the industrial engineering staff as part of their efforts to 
compute the piece rate for different jobs.) The critical independent 
variables for the analysis are years of education, high school gradua¬ 
tion, job complexity, and a measure of the “match” between education 
level and job complexity: match = [education - mean(education)] ×
[job complexity - mean(job complexity)]. If better-educated individuals
have a comparative advantage in more complex production jobs,
the coefficient on the match term in an output equation would be 
positive. 
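
As an illustration of how the match term is built, the sketch below demeans education and job complexity and multiplies the two deviations worker by worker. The four workers and their values are invented for the example; job complexity is entered directly in log weeks.

```python
from statistics import mean


def match_terms(education, complexity):
    """(education - mean education) * (complexity - mean complexity), per worker."""
    e_bar = mean(education)
    c_bar = mean(complexity)
    return [(e - e_bar) * (c - c_bar) for e, c in zip(education, complexity)]


education = [10, 12, 12, 14]          # years of schooling (hypothetical)
complexity = [0.0, 1.0, 2.0, 1.0]     # log weeks needed to learn the job (hypothetical)

match = match_terms(education, complexity)
print(match)  # [2.0, 0.0, 0.0, 0.0]
```

The term is positive when a worker is above (or below) the mean on both dimensions and negative when the two deviations have opposite signs, which is why a positive coefficient in an output equation would indicate a comparative advantage of better-educated workers in more complex jobs.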

Job complexity was defined as the logarithm of the number of 
weeks the plant’s industrial engineering staff estimates it should take 
a new employee to learn the job (achieve the expected productivity 
rating for an experienced worker on that job). The main component 
in the industrial engineers' calculation is the number of times per 
week an experienced worker performs the task (see fig. 1). Mean job 
complexity is the average level of job complexity for the subsample of
workers at each location. 

Although there is a large literature examining the effect of job 
enrichment and job complexity on job satisfaction and performance, 
there is considerable controversy in interpreting these results. There 
Fig. 1


are both biases in experimental design—often the studies rely on 
screened volunteers who are given pay supplements (see Fein 1975)— 
and biases in the reporting of results: Hackman (1974) points out that 
papers describing the results of job enrichment programs are typi¬ 
cally written by the consultants that implemented them. Consultants 
are more likely to publicize their successes than their failures. 

In general one would expect stick-to-itiveness (those unobservable 
traits that lead individuals to complete high school) to have its greatest 
effect on quit propensity and to have a lesser effect on other aspects 
of behavior such as absenteeism. On the other hand, one direct effect 
of education and of high school graduation is to improve alternative 
opportunities, increasing the probability of a quit. 

A. Quits 

Let $S_i$ denote worker $i$'s present value of his current job, $V_i$ denote $i$'s
present value of his best opportunity elsewhere, and $M_i$ denote $i$'s
mobility costs (both real and psychic) of a job change (housework and
leisure are counted as jobs). Assume that worker $i$ is risk neutral and
quits if and only if

$$V_i - S_i > M_i. \tag{1}$$

Denote the time discount rate of workers by $r$, worker $i$'s age at the
start of the observation period by $a_i$, and the value of an individual's
alternative and present job at time $t$ by $V_i(t)$ and $S_i(t)$, respectively.
Then, assuming that male workers anticipate working until they are
65 years old and normalizing $t = 0$ for the time when the quit decision
is made, we get

$$V_i - S_i = \int_0^{65 - a_i + \beta_0 F} e^{-rt}\,[V_i(t) - S_i(t)]\,dt, \tag{2}$$

where $F$ is a dummy variable indicating whether a worker is a female,
and $\beta_0$ is an estimated parameter of the problem. If females anticipate
working fewer years, $\beta_0$ is negative. Next, let $p_i(t)$ represent the probability
that individual $i$ changes jobs after the observation period and
before period $t$, and let the value of that new job be a weighted
average of the value of the previous job and some constant term $B_i$.

Assume that the values of the alternative and the present job grow
at an exponential rate so that

$$V_i(t) = e^{\mu t}\{[1 - p_i(t)]V_i(0) + p_i(t)[\gamma V_i(0) + (1 - \gamma)B_i(0)]\}, \tag{3a}$$

$$S_i(t) = e^{\mu t}\{[1 - p_i(t)]S_i(0) + p_i(t)[\gamma S_i(0) + (1 - \gamma)B_i(0)]\}, \tag{3b}$$

where $0 < \gamma < 1$. Then, with $\delta = (r - \mu)$ and $\alpha_i = 1 - (1 - \gamma)p_i$,
worker $i$ quits if and only if

$$\int_0^{65 - a_i + \beta_0 F} \alpha_i e^{-\delta t}\,[V_i(0) - S_i(0)]\,dt > M_i. \tag{4}$$

To obtain an estimable quit equation from (4), assume

$$\alpha_i V_i(0) = X_1\beta_1, \tag{5}$$

$$\alpha_i S_i(0) = X_2\beta_2, \tag{6}$$

$$M_i = X_3\beta_3. \tag{7}$$

Denoting $[e^{-\delta(65 - a_i + \beta_0 F)} - 1]/(-\delta)$ by $g(a, \delta, \beta_0 F)$ and substituting (5)-(7)
into (4), we get

$$Q_i = \begin{cases} 1 & \text{if } g(a, \delta, \beta_0 F)(X_1\beta_1 - X_2\beta_2) - X_3\beta_3 > 0 \\ 0 & \text{otherwise.} \end{cases} \tag{8}$$

Clearly, the net gain from changing jobs, the left-hand side of (8), will
be measured with error. Assume that this error term, denoted $\mu_Q$, is
distributed $N(0, \sigma^2)$. Therefore, the quit equation estimated is

$$Q_i = \begin{cases} 1 & \text{if } g(a, \delta, \beta_0 F)(X_1\beta_1 - X_2\beta_2) - X_3\beta_3 > \mu_Q \\ 0 & \text{otherwise.} \end{cases} \tag{9}$$
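
To make the structure of the estimated quit equation concrete, the sketch below computes the discount term g and the quit probability that a normally distributed error implies, namely Φ of the index divided by σ. It is an illustration under stated assumptions, not the estimation code: `gain_index` stands in for the difference between the alternative-job and current-job indices, `mobility_cost` for the mobility-cost index, and all numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def g(age, delta, beta0, female):
    """g = [exp(-delta * T) - 1] / (-delta), with T = 65 - age + beta0 * female.

    T is the anticipated remaining work life; delta must be nonzero.
    """
    horizon = 65 - age + beta0 * female
    return (math.exp(-delta * horizon) - 1.0) / (-delta)


def quit_probability(age, female, gain_index, mobility_cost,
                     delta, beta0, sigma=1.0):
    """P(quit) when the error is N(0, sigma^2): Phi((g * gain - cost) / sigma)."""
    index = g(age, delta, beta0, female) * gain_index - mobility_cost
    return NormalDist(0.0, sigma).cdf(index)


# Hypothetical inputs: a positive net gain from the alternative job and a
# fixed mobility cost, evaluated for a younger and an older male worker.
p_young = quit_probability(25, 0, gain_index=0.05, mobility_cost=1.5,
                           delta=0.05, beta0=-7.0)
p_old = quit_probability(55, 0, gain_index=0.05, mobility_cost=1.5,
                         delta=0.05, beta0=-7.0)
print(p_young, p_old)
```

Because g grows with the remaining work life, a given per-period gain from switching jobs weighs more heavily for younger workers, so in this example the younger worker has the higher quit probability, while variables that enter only through mobility costs shift the index by the same amount at every age.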
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The worker characteristics that are measured and that affect alter¬ 
native opportunities include education, race, sex, age, geographic lo¬ 
cation, dexterity (as measured by scores on the physical dexterity test), 
and employment status when hired (a worker who was on a tempo¬ 
rary layoff from a desirable previous job might be likely to quit when 
recalled from the layoff). 13 The factors affecting alternative opportu¬ 
nities could also affect job satisfaction. For example, one would expect 
workers that were employed at the time of application to have a 
significantly higher level of job satisfaction (one reason they left the 
previous job was the anticipation of increased job satisfaction) and 
hence to be less likely to quit. There are also job-related characteris¬ 
tics, such as job complexity, that affect job satisfaction but not alterna¬ 
tive opportunities. 14 

Finally, one would expect stick-to-itiveness (as measured by high 
school and college graduation) and the worker’s marital status to im¬ 
pose additional mobility costs. 15 Those variables are not multiplied by 
$g(a, \delta, \beta_0 F)$. They enter directly into the quit equation as elements of
$X_3$. I also included education as an element of $X_3$. Thus education was
allowed to directly affect quit propensity if students learn not to quit 
during their postprimary years of schooling (the years over which 
schooling levels differed in the sample). Since education may affect 
alternative opportunities and job satisfaction, I also included education
times $g(a, \delta, \beta_0 F)$ as a right-hand variable in the estimated quit
equation. 

As can be seen in table 2, the major hypothesis is confirmed. High 
school graduation has a strong negative effect on a worker’s probabil¬ 
ity of quitting. Because this effect is independent of the direct effect 
of schooling, we can reject the hypothesis that the reason better- 
educated individuals have lower quit propensities is that they learned 
not to quit in secondary school. 

Equation (9) was estimated by the method of maximum likelihood, 
using a pooled sample from plants A and B. Individuals were omitted
if they were laid off before being with the firm for 6 months, and 
selection bias was avoided by also omitting workers who would have 
been laid off before 6 months had passed had they not already quit. 

13 Of course employers are concerned with the total effect of education on the worker's
probability of quitting, taking into account the better alternative opportunities
available to the better-educated workers. However, we are concerned with explaining
wage differences in the market, not at the firm being studied, where there are no wage
differences due to education differences. In the market equilibrium, better-educated
workers have higher wages as well as better alternative opportunities. It is differences
in quit propensities, not the effect of alternative opportunities on quits, that contribute
to differences in the equilibrium wage at different education levels.

14 Contemporaneous wage was not included as an element of $X_2$ because all workers
had substantively the same expected lifetime wage on their current job.

15 See Mincer (1978) for an analysis of the effect of marital status on worker mobility.
TABLE 2

Probability of Quitting within First 6 Months on the Job
(Maximum Likelihood Estimation Procedure)

                                          Estimated               Standard Errors
                                          Coefficient    Gradient     Hessian      White's
Independent Variable*                     or Value       Method       Method       Method

$\delta$                                    .26          2.04          .61          .28
                                                        (-.13)       (-.42)       (-.91)
$\beta_0$ (effect of being female on      -6.81         42.3         10.6          4.19
  anticipated work life)                                (-.16)       (-.63)       (-1.62)
High school graduate                       -.342          .140         .135         .136
                                                        (-2.44)      (-2.54)      (-2.61)
College graduate                           -.242          .367         .397         .444
                                                        (-.66)       (-.62)       (-.59)
Education                                  -.07           .171         .057         .035
                                                        (-.44)       (-.67)       (-.76)
$g(a, \delta, \beta_0 F)$ x education       .015          .143         .044         .029
                                                        (.11)        (.34)        (.53)
Married                                    -.091          .078         .077         .077
                                                        (-1.16)      (-1.18)      (-1.17)
$g(a, \delta, \beta_0 F)$ x employed       -.098          .774         .231         .107
  at application                                        (-.13)       (-.43)       (-.92)
Number of observations                     2,146
Log likelihood                             -741.99

Note: t-statistics are in parentheses. Throughout this paper I loosely use the term "t" to refer to the estimated
coefficient divided by the standard error.

* Other independent variables included scores on each half of the dexterity test, race-location interactions,
location effects, age, and an intercept term. I used three different methods for calculating the standard errors
because there is no consensus as to which is the correct technique. If the model is correctly specified, the three
methods give the same asymptotic estimates. The estimates obtained are sufficiently similar to provide some
confidence that the model is not grossly misspecified. Computational considerations precluded performing the model
specification test suggested by White (1982).


(All layoffs were made strictly by seniority.) Because the likelihood 
function is almost flat with respect to changes in $\delta$, offset by compensating
changes in the vector of variables multiplied by $g$, the standard
errors of $\delta$ and of the coefficients of variables multiplied by $g(a, \delta, \beta_0 F)$
are high. Although statistically insignificant, they have values
consistent with the model. Individuals such as white males who have 
better alternative opportunities are more likely to quit these jobs; 
workers who were employed when they applied for these jobs are less 
likely to quit. Similar confirmation was provided by the negative value 
of $\beta_0$. The estimates suggest that females in the sample anticipate
spending 7 fewer years in the labor force than males. This finding is 
consistent with unpublished research by Jacob Mincer. Using the Na¬ 
tional Longitudinal Study sample, he finds that women spend roughly 
25 percent less time in the labor force than men. Finally, the similarity 
in the standard errors obtained using the gradient, Hessian, and 
White (1982) methods suggests that the model is not grossly misspeci- 
fied. 

To obtain more precise estimates of the coefficients and to make 
TABLE 3

Probability of Quitting within First 6 Months

                                                                        Δ Quit Probability from a
                                                                        One-Unit Change in the
                                                                        Variable Evaluated at
                                                                        Quit Probability of
Independent Variable                         Coefficient  t-Statistic   10%          20%

Intercept                                     -.922        -1.54
High school graduate                          -.341        -2.54        -.059        -.096
College graduate                              -.280         -.72        -.049        -.078
Education                                     -.106        -1.32        -.019        -.030
Married                                       -.097        -1.26        -.017        -.027
$h(a, \delta, \psi F)$ x education             .0055        1.45         .00097       .0015
$h(a, \delta, \psi F)$ x age                  -.00029       -.32        -.000052     -.000083
$h(a, \delta, \psi F)$ x male                  .0012         .16         .00022       .00034
$h(a, \delta, \psi F)$ x white x South         .068         4.58         .012         .019
$h(a, \delta, \psi F)$ x South                -.036        -2.34        -.0062       -.0099
$h(a, \delta, \psi F)$ x white x Midwest       .0061        1.00         .0011        .0017
$h(a, \delta, \psi F)$ x employed             -.0023       -4.95        -.0040       -.0064
  at application
$h(a, \delta, \psi F)$ x pins section          .00020        .34         .000035      .000056
  of dexterity test
$h(a, \delta, \psi F)$ x screws section        .00094       1.94         .00016       .00026
  of dexterity test
Number of observations                        2,146

Note: $\psi$ set equal to -.25, $\delta$ set equal to .05. 0.121 of observations had $Q = 1$; the mean of $h(a, \delta, \psi F)$ is 16.12.


use of Mincer's findings on the shorter work life of women, we can
reformulate equation (9) as

$$Q_i = \begin{cases} 1 & \text{if } h(a, \delta, \psi F)(X_1\beta_1 - X_2\beta_2) - X_3\beta_3 > \mu_Q \\ 0 & \text{otherwise,} \end{cases} \tag{9'}$$

where $h(a, \delta, \psi F) = [e^{-.05(1 - .25F)(65 - a)} - 1]/(-.05)$. That is, we can
impose the restrictions that $r - \mu = .05$ and that the effective work
life of women is 25 percent shorter than that of men.

Equation (9') was estimated using a probit estimation procedure.
The estimated coefficients and the effect of a change in each independent
variable on a worker's probability of quitting are presented in
table 3. 16

16 Note that in table 3 job complexity and match were not included as independent
variables. If they are included as independent variables when (9') is estimated, the
sample size falls to 1,532 and the absolute value of all the t-statistics also falls. However,
none of the results in table 3 is significantly affected: high school graduation is estimated
to correspond to a change in the worker's probability of quitting of -.059 at a
10 percent quit probability and -.096 at a 20 percent quit probability. Similarly, if quit
equations are estimated separately for men and women in the sample, the qualitative
effects in table 3 hold for each subsample.
High school graduation has a large and statistically significant effect
on a worker's probability of quitting. Evaluated at a 10 percent proba¬ 
bility of quitting, high school graduation decreases a worker’s proba¬ 
bility of quitting by almost 6 percent. At a 20 percent probability of 
quitting the decrease is 9.6 percent. The continuous effect of educa¬ 
tion on quit propensity is also negative, suggesting that workers with 
more education have more stick-to-itiveness, while when education is 
interacted with the h () function its coefficient is positive, as would be 
predicted from the better alternative opportunities available to better- 
educated workers. 
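
The probability changes quoted above can be approximately recovered from the probit coefficients. The sketch below assumes, as an interpretation rather than something stated in the text, that the effect at a baseline quit probability p is the coefficient times the standard normal density evaluated at Φ⁻¹(p), which is the usual derivative of a probit probability with respect to its index.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probit_marginal_effect(coefficient, baseline_probability):
    """Coefficient times the normal density at the index implied by the baseline."""
    z = STD_NORMAL.inv_cdf(baseline_probability)
    return coefficient * STD_NORMAL.pdf(z)


# High school graduate coefficient from table 3.
hs_at_10 = probit_marginal_effect(-0.341, 0.10)
hs_at_20 = probit_marginal_effect(-0.341, 0.20)
print(round(hs_at_10, 3), round(hs_at_20, 3))
```

Under this assumption the results come out close to the -.059 and -.096 reported in table 3 (within rounding of the published coefficient). Strictly, these are changes in percentage points of quit probability.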

The other estimated variables in table 3 are also consistent with the 
model. Being employed when hired by your current employer has a 
negative effect on a worker's probability of quitting. This is consistent 
with our model since those workers incurred greater costs in taking 
this job and hence were likely to have a higher anticipated level of job 
satisfaction. Southern blacks are far less likely to quit than southern 
whites, perhaps reflecting differences in their alternative opportuni¬ 
ties. On the other hand, males do not seem more likely to quit than 
females, suggesting that sex discrimination by other firms may be of 
relatively low magnitude compared with race discrimination in the 
South. 

These results were not sensitive to the particular specification of the 
quit equation. When I estimated either a probit, logit, or linear proba¬ 
bility model with no interactions between the h () function and the 
explanatory variable, the results were substantially unaffected. The 
results from the probit model with no interactions between the h () 
function and the other explanatory variables are presented in table 4. 
In this table, high school graduation continues to have a strong nega¬ 
tive relationship with predicted quit propensity, while the continuous 
relationship between years of education and quit propensity is weak 
and statistically insignificant. Apparently the greater stick-to-itiveness 
of better-educated workers is offsetting their superior alternative op¬ 
portunities, eliminating any discernible relationship between educa¬ 
tion (other than high school graduation) and quits. 

To provide a rough check of whether peculiarities in the distribution
of education were generating the results in tables 2, 3, and 4, I
plotted in figure 2 histograms of the proportions of quitters and 
nonquitters at each education level. I also checked to see if the 
dummy variable for high school graduation was capturing non- 
linearities in the relation between education and quits. When the 
dummy variable for high school graduation, in the model estimated 
in table 3, was replaced by a dummy variable for completion of elev¬ 
enth grade or completion of thirteenth grade, the coefficients of these 
latter terms were statistically insignificant. As an alternate means of 
TABLE 4

Probability of Quitting within First 6 Months on the Job
(Probit Model with Suppressed Interaction Terms)

                                                   Coefficient
Independent Variable                           (1)           (2)

Intercept                                      -.561         -.664
                                               (-1.00)       (-.36)
High school graduate                           -.329         -.314
                                               (-2.51)       (-7.27)
College graduate                               -.227
                                               (-.59)
Education                                      -.024          .0025
                                               (-.55)        (.008)
Education squared                                            -.0016
                                                             (-.14)
Married                                        -.082         -.083
                                               (-1.10)       (-1.11)
Age                                            -.021         -.021
                                               (-3.64)       (-3.61)
Male                                            .173          .173
                                               (2.05)        (3.61)
White-South                                     .956          .954
                                               (5.09)        (5.08)
South                                          -.354         -.352
                                               (-1.77)       (-1.76)
White-Midwest                                   .088          .083
                                               (.861)        (.822)
Employed when hired                            -.392         -.392
                                               (-5.20)       (-5.18)
Score on pins part of dexterity test            .0026         .0026
                                               (.280)        (.27)
Score on screws part of dexterity test          .018          .018
                                               (2.26)        (2.26)
Number of observations                         2,236         2,236
Log likelihood                                 -788.8        -788.99

Note: t-statistics are in parentheses.


capturing nonlinearities, I added education squared and education
cubed to the independent variables listed in table 3. With that cubic
specification, I found that at a 10 percent quit probability the discontinuous
effect on quits from graduation from high school reduces a
worker’s probability of quitting by 5.1 percent; at a 20 percent quit
probability the discontinuous effect of graduation from high school
reduces a worker’s probability of quitting by 8.1 percent. The t-statistic
on high school graduation given the cubic specification for the
effect of education is -1.84. 17

17 This fall in the t-statistic from including higher-order polynomial terms is, of
course, not surprising. There are several reasons why I expected the coefficient on high
school graduation to be sensitive to the addition of $(\text{education})^2$ and $(\text{education})^3$ as
independent variables: (1) there were relatively few quits, (2) the sample clustered
Fig. 2. Years of education for quitters and nonquitters. Left columns of pairs refer
to nonquitters, right columns to quitters. (Horizontal axis: years of education.)


I checked the finding that high school graduation significantly affects
quit propensities using the 1968-82 PSID sample. As in table 4, I
suppressed the interactions between the h () function and the other
explanatory variables. The results in table 5 are consistent with those
in tables 2, 3, and 4. Completion of high school is associated with a
reduction in the individual's quit rate of roughly one-third, while
postprimary schooling (aside from completion of twelfth grade) has
an insignificant effect on quit rates. These results are also robust to
changes in the specification of the quit function and of the error term.
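
The "roughly one-third" figure can be checked against table 5. The sketch below reads the high school coefficient in part A as -.030 (the printed digits are damaged in this copy, so that reading is an assumption) and compares it with the mean of .092 quits per year; it also converts the part B logit coefficient into an odds ratio as a second, purely illustrative, gauge of magnitude.

```python
import math

# Part A of table 5 (linear model of quits per year).
hs_coefficient_linear = -0.030   # high school graduate; assumed reading of a damaged entry
mean_quits_per_year = 0.092

relative_reduction = -hs_coefficient_linear / mean_quits_per_year
print(round(relative_reduction, 2))  # 0.33

# Part B of table 5 (logistic model of the annual quit probability).
hs_coefficient_logit = -0.338
odds_ratio = math.exp(hs_coefficient_logit)
print(round(odds_ratio, 2))  # 0.71
```

On this reading, high school completion lowers quits per year by about a third of the sample mean, and the logit estimate implies quit odds roughly 29 percent lower for graduates, which is broadly in line with the statement in the text.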


around an education level of 12.1 (the standard deviation was 1.2), and (3) with the
cubic specification the independent variables included six different measures of education.
Given these difficulties, I was surprised to find that the discontinuous relationship
between graduation from high school and quit propensity was still observable with the
cubic specification for the continuous effect of education on quit probabilities.


TABLE 5

Male Private-Sector Employees in PSID Sample, 1968-82

A. Quits per Year

Independent Variable                              Coefficient

High school graduate                              -.030
                                                  (-3.58)
Education                                         -.024
                                                  (-2.26)
Education squared                                  .0030
                                                  (2.53)
Education cubed                                   -.00011
                                                  (-2.76)
ln(mean experience while in sample)               -.025
                                                  (-3.67)
New entrant                                        .014
                                                  (1.62)
Nonwhite                                          -.030
                                                  (-5.05)
Age                                               -.0023
                                                  (-5.30)
Mean city unemployment rate minus mean            -.003
  national unemployment rate                      (-1.79)
Number of observations                            3,781
R^2                                                .16
Mean quits per year                                .092

B. Annual Probability of Quitting (Logistic Model)

Independent Variable*                             Coefficient    Standard Error

Intercept                                         1.48           .590
Education                                         -.59           .180
Education squared                                  .072          .020
Education cubed                                   -.0025         .00065
Work experience                                   -.111          .0118
Work experience squared                            .0012         .00033
Disabled                                           .172          .119
Nonwhite                                          -.429          .087
Married                                           -.408          .083
Union member                                      -.744          .086
High school graduate                              -.338          .132
Part-time worker                                   .263          .094
National unemployment rate                        -.084          .027
County unemployment rate                           .016          .062
County unemployment rate squared                  -.0033         .004

Note: Each observation was weighted by the square root of the number of years the individual was in the
sample. t-statistics are in parentheses in pt. A. I did not include wages as an independent variable in these regressions
since I was concerned with the indirect effect of various demographic characteristics on wages through their effect
on quit propensities.

* Eight location dummies were also included.
In Appendix B I show why differences in expected absenteeism rates
can significantly affect the wages of workers. In this subsection, I 
estimate the discontinuities in expected absenteeism rates associated 
with high school graduation for workers in the sample. 

In estimating the determinants of absenteeism I assume that a 
worker is more likely to be absent the higher is his level of job dissatis¬ 
faction and the lower his stick-to-itiveness. As in the quit equation, job 
satisfaction is measured by the elements of X 2 . Education and high 
school graduation are used as proxies for persistence: an individual 
who had insufficient persistence to complete high school is likely to 
have poorer than expected attendance habits. I assume that both days 
and occasions of absenteeism as percentages of possible days worked 
are linear functions of X 2 , the proxies for stick-to-itiveness, and nor¬ 
mally distributed error terms (consecutive days absent are referred to 
as a single occasion). 

In the quit equation we can separate the effect of education on job 
satisfaction from its effect on stick-to-itiveness by arguing that job 
satisfaction has a greater effect on the quit rates of workers with 
longer expected future work lives. (The education x g(a, 8, p ( >F) term 
captured the effect of education on quits that was caused by its effect 
on either alternative opportunities or job satisfaction.) This option is 
not available in the absenteeism equations: workers are trading off 
one-period gains and losses from absences. Consequently, the esti¬ 
mated coefficients on education and high school graduation reported 
in tables 6 and 7 could be estimating the effect of lower (or higher) 
levels of job satisfaction of better-educated workers on their absence 
rates rather than the effect of stick-to-itiveness on absenteeism. How¬ 
ever, the results from the previously estimated quit equations suggest 
that this is not happening. 

Percentage of days absent was estimated using a Tobit procedure. 18 
I calculated the percentage of days workers were absent during their 
first 6 months on the job. However, because not all the workers in the 
sample completed 6 months of work, those observations were 
weighted by the square root of the number of days for which they 
were observed. I also used occasions of absenteeism as a dependent 
variable. In those calculations I restricted the sample to individuals 
who worked for 6 months. I assumed that occasions of absenteeism 

18 The Tobit procedure implicitly assumes that discrete absenteeism data can be 
approximated by a continuous distribution and that the error term in the estimation 
equation is normally distributed but truncated, so that absenteeism rates that the model 
predicts would be negative are recorded as zeros. Of course, theoretically absences are 
also truncated from above: no one can be absent more than 100 percent of the time. In 
this sample the problem did not arise. 
TABLE 6

Days Absent as a Percentage of Days Worked
(Tobit Estimation Procedures)

Independent Variable              Coefficient
Intercept                           7.38       (2.54)
Male                                 .398      (-.95)
Age                                 -.143     (-5.89)
Education                            .133       (.58)
High school graduate               -2.35      (-4.04)
College graduate                   -1.34       (-.69)
Married                             -.046      (-.13)
Job complexity                      -.12       (-.45)
Match                               -.21      (-1.10)
Employed at application            -1.23      (-3.56)
Number of observations              1,890
Log likelihood                     -4,781.2

Note.—The data for tables 6-8 come from three locations of
the firm, rather than only the two included in the quit equations.
Each observation was weighted by the square root of the number of
days worked. t-statistics are in parentheses.


are generated from a Poisson arrival process and that the λ parameter 
that describes the Poisson process has a gamma distribution across the 
population. The resulting distribution is negative binomial. 19 I com¬ 
puted maximum likelihood estimates for that distribution. (Estima¬ 
tion using Tobit is inappropriate for this problem because the con¬ 
tinuity assumption of the Tobit procedure is grossly violated when 
computing occasions of absences.) 
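
The mixture argument can be checked numerically. The sketch below is only illustrative: the gamma shape `alpha` and rate `beta` are hypothetical values, not estimates from the paper, and the function names are my own. It integrates the Poisson probability over a gamma-distributed λ and compares the result with the closed-form negative binomial probability.

```python
import math

def negative_binomial_pmf(y, alpha, beta):
    """Closed-form P(Y = y) when lambda ~ Gamma(shape=alpha, rate=beta)."""
    log_p = (math.lgamma(y + alpha) - math.lgamma(alpha) - math.lgamma(y + 1)
             + alpha * math.log(beta / (1.0 + beta))
             - y * math.log(1.0 + beta))
    return math.exp(log_p)

def poisson_gamma_mixture_pmf(y, alpha, beta, upper=40.0, steps=40_000):
    """Midpoint-rule integral of Poisson(y | lam) * Gamma(lam | alpha, beta)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        poisson = math.exp(-lam) * lam ** y / math.factorial(y)
        gamma_density = (beta ** alpha * lam ** (alpha - 1.0)
                         * math.exp(-beta * lam) / math.gamma(alpha))
        total += poisson * gamma_density * h
    return total

# Hypothetical parameter values, chosen only for illustration.
alpha, beta = 2.0, 1.5
for y in range(4):
    closed_form = negative_binomial_pmf(y, alpha, beta)
    mixture = poisson_gamma_mixture_pmf(y, alpha, beta)
    print(y, round(closed_form, 6), round(mixture, 6))
```

The two columns should agree to several decimal places, which is the sense in which a gamma-distributed Poisson rate yields negative binomial counts of occasions absent.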

Whether days or occasions are used as a measure of absenteeism, 
we find that high school graduation has a negative effect on absen¬ 
teeism. This effect is statistically significant regardless of the estima¬ 
tion procedure used and holds even when we attempt to capture 
nonlinearities in the relationship between education and absenteeism 
by allowing for a cubic specification. The continuous effect of second¬ 
ary education on absenteeism is small relative to the discontinuous 
effect of graduation from high school. It seems unlikely that high 
school graduates learned in secondary school the traits that gave them 

19 This result comes from Greenwood and Yule (1920). A derivation appears in 
Johnson and Kotz (1969, p. 124). 
TABLE 7

Occasions Absent during the First 6 Months on the Job
(Negative Binomial Model)

Independent Variable*             Coefficient
Intercept                           4.46       (1.03)
Male                                -.113     (-2.33)
Age                                 -.031     (-8.03)
Education                           -.945      (-.786)
High school graduate                -.362     (-2.94)
College graduate                     .389       (.998)
Married                              .0344      (.716)
Match                               -.00736    (-.238)
Job complexity                       .00819    (1.52)
Employed at application             -.181     (-3.84)
Number of observations              1,740
Log likelihood                      1,533.78

Note.—Standard errors were calculated using the Hessian
method. t-statistics are in parentheses.

* Other independent variables were location, score on the screws
section of the dexterity test, and race-location interactions for the
two plants at which race data were available.


low rates of absenteeism. Instead, the same unobserved characteristics 
that lead to successful completion of high school seem to lead to low 
rates of absenteeism. 


C. Output per Hour 

The final indicator of performance explored was the logarithm of 
normalized output per hour during the worker’s first month at the 
manufacturing facilities studied (see table 8). As previously noted, job 
assignments of newly hired workers were independent of their educa¬ 
tion, scores on either part of the physical dexterity test, and prior 
work experience. They are paid piece rate during that month. I as¬ 
sume that output per hour is a linear function of observed character¬ 
istics, including the match term described above. The distribution of 
jobs was described in figure 1. 

Assigning high school graduates to more complex jobs has neither 
TABLE 8

Normalized First-Month Output Estimated by Ordinary Least Squares

                                                    Coefficient
Independent Variable*                        (1)        (2)        (3)
Intercept                                   67.36      69.05      71.63
                                            (5.80)     (6.24)     (7.14)
Male                                         8.91       8.87       8.87
                                            (5.81)     (5.80)     (5.80)
Age                                          -.11       -.11       -.11
                                           (-1.15)    (-1.16)    (-1.16)
Education                                    1.36       1.34       1.53
                                            (1.63)     (1.61)     (1.59)
High school graduate                         4.44       2.54       -.21
                                             (.64)      (.45)     (-.08)
Match                                         .49       ...         .05
                                             (.48)                 (.04)
Job complexity × high school graduate       -2.64      -1.55       ...
                                            (-.73)     (-.55)
College graduate                           -12.96     -13.12     -13.03
                                           (-2.09)    (-2.11)    (-2.10)
Married                                      4.01       4.02       4.01
                                            (2.98)     (2.98)     (2.98)
Score on dexterity test                       .22        .22        .21
                                            (1.52)     (1.53)     (1.53)
Job complexity†                              7.82       6.98       5.63
                                            (2.45)     (2.60)     (5.26)
Employed at application                      1.48       1.45       1.48
                                            (1.08)     (1.06)     (1.08)
Number of observations                      1,859      1,859      1,859
Multiple R^2                                 .382       .382       .382

Note.—t-statistics are in parentheses.

* Other independent variables were location dummies.

† The statistically significant coefficients for job complexity suggest that the industrial engineers overestimate the
difficulty of performing complex jobs.


a statistically nor economically significant effect on output, nor does 
matching better-educated workers with the more complex jobs 
significantly affect output. These findings suggest that the pecuniary 
reward to education for the workers in the sample is not due to skills 
learned in school that help those workers learn complex production 
tasks. 

The direct effect of education on output is marginally significant 
but does not seem large enough to explain the correlation between 
education and wages. In table 1 I computed the effect of education on 
the logarithm of their real wages on their previous job for workers in 
plant A. The best estimate was that each year of secondary education 
had roughly a 3.7 percent effect on the previous wage of the workers 
in that plant. On the other hand, I find that each year of secondary 
education has roughly a 1.3 percent effect on the output of workers in 
that plant. (This estimated effect is not statistically different from 
zero. The mean value of job complexity for this subsample was 1.76. 
Consequently, for the average job, high school graduation does not 
affect output.) 
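
The statement about the average job can be reproduced from the coefficients in table 8. The following is a minimal sketch, assuming the effect of graduation at a given job complexity is the coefficient on high school graduate plus the interaction coefficient times job complexity; the function name is my own.

```python
def graduation_effect(hs_coef, interaction_coef, job_complexity):
    """Effect of high school graduation on output at a given job complexity."""
    return hs_coef + interaction_coef * job_complexity

MEAN_JOB_COMPLEXITY = 1.76  # mean value reported in the text

# Coefficients on "High school graduate" and
# "Job complexity x high school graduate" from table 8.
effect_col1 = graduation_effect(4.44, -2.64, MEAN_JOB_COMPLEXITY)
effect_col2 = graduation_effect(2.54, -1.55, MEAN_JOB_COMPLEXITY)

print(round(effect_col1, 2), round(effect_col2, 2))  # -0.21 -0.19
```

Both values are close to zero relative to mean first-month output, and the column 1 figure matches the coefficient of -.21 in column 3, where the interaction term is omitted.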


IV. Effects on Wages 

Although one should be cautious about generalizing the results in 
Section II beyond the sample of semiskilled production workers, the 
PSID provides evidence that the wage premium received by high 
school graduates is due, at least in part, to their low quit propensity. 

There are both theoretical and empirical grounds for believing that 
during business slumps firms hoard workers: they pay wages above the 
value of the product of the workers. (Fay and Medoff [1985] estimate 
excessive staffing levels of 4 percent for a typical manufacturing plant 
during its most recent trough quarter.) Consequently, during business 
slumps one would expect firms to benefit from (or be hurt less 
by) quits and absences. To obtain quantitative estimates of the effect 
of the low quit propensity and low absenteeism of high school graduates 
on their wages, I interacted county unemployment rates with 
high school graduation while estimating a wage equation for privately 
employed males in the PSID data set between 1976 and 1982. 20 I 
included education, education squared, and education cubed as independent 
variables to reduce the possible role of high school graduation 
in approximating a precluded degree of curvature in the relationship 
between wages and education. 

As a further check on possible nonlinearities generating these results, 
I replaced the dummy variable for high school graduation first 
with a dummy variable for completion of eleventh grade and then 
with a dummy variable for completion of the thirteenth year of 
schooling. Neither of these dummy variables was statistically significant. 
Thus the results do not seem due to the dummy variable for 
completion of high school capturing additional nonlinearities in the 
relationship between education and wages. 

In table 9, I estimate that at the mean county unemployment rate of 

20 Complete wage data are not available in the PSID data set for years prior to 1976. 
For those years it is possible to construct wages using reported earnings and hours 
worked. Duncan and Hill (1985, pp. 519-20) verified reported earnings calculated in 
this way with employer records and found that “errors in interview reports of average 
hourly earnings (defined as the ratio of interview reports of annual earnings to annual 
hours) were enormous." Consequently, I used only years in which wages were directly 
reported. Because wages reported were truncated for 1976 and 1977 (no one could 
report a wage above $9.99), I inferred wages for workers that reported a wage of $9.99 
by extrapolating back from their reported wage in 1978. I assumed that their percentage 
wage increase was equal to the average in the sample, except that if the extrapolated 
wage was less than $9.99 I assumed that it was equal to $9.99. 
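
The imputation rule in this note can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: the note does not say how the average increase was applied over more than one year, so annual compounding is assumed here, and the growth rate and wages in the usage lines are hypothetical.

```python
TOP_CODE = 9.99  # highest wage that could be reported in 1976 and 1977

def backcast_wage(wage_1978, avg_growth, years_back):
    """Extrapolate a top-coded wage back from the 1978 report.

    avg_growth is the sample-average annual percentage wage increase
    (as a fraction); compounding over years_back is an assumption.
    The result is floored at the top code, as the note describes.
    """
    implied = wage_1978 / (1.0 + avg_growth) ** years_back
    return max(TOP_CODE, implied)

# Hypothetical workers who reported $9.99 in 1977.
print(round(backcast_wage(12.00, 0.08, 1), 2))  # implied wage above the top code
print(round(backcast_wage(10.50, 0.08, 1), 2))  # implied wage below 9.99, floored
```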



TABLE 9

Natural Log of Real Wages of Male Private-Sector Employees
in PSID Sample, 1976-81

                                                       Estimated Coefficient
Independent Variable*                                    (1)          (2)
Intercept                                                .051         .054
                                                        (5.62)       (5.93)
Education                                                .078         .077
                                                        (2.87)       (2.84)
Education squared                                       -.0062       -.0062
                                                       (-2.16)      (-2.16)
Education cubed                                          .00029       .00029
                                                        (3.22)       (3.23)
Reported experience                                      .028         .028
                                                       (13.71)      (13.52)
Reported experience squared                             -.00040      -.00039
                                                      (-12.09)     (-11.86)
Age                                                     -.0031       -.003
                                                       (-2.14)      (-2.34)
Disabled                                                -.087        -.085
                                                       (-6.44)      (-6.35)
Married                                                  .084         .082
                                                        (8.25)       (8.11)
Part-time work                                           ...         -.081
                                                                    (-6.75)
Nonwhite                                                -.156        -.159
                                                      (-17.79)     (-18.02)
High school graduate                                     .133         .140
                                                        (5.03)       (5.12)
High school graduate × county unemployment rate         -.010        -.010
                                                       (-3.03)      (-2.99)
New hire (less than 1 year of tenure)                   -.110        -.099
                                                      (-11.88)      (-9.48)
Union member                                             .169         .167
                                                       (21.08)      (20.77)
County unemployment rate                                 .0015        .0014
                                                         (.51)        (.51)
Average quits per year in the sample                     ...         -.126
                                                                    (-2.28)
(Average quits per year in the sample)^2                 ...          .118
                                                                     (1.96)
High school graduate × average quits                     ...         -.070
                                                                    (-1.63)
Number of degrees of freedom                            9,517        9,514

Note.—t-statistics are in parentheses.

* Other independent variables were eight location dummy variables, a year dummy variable for each year, and an
intercept term. The model was also estimated with tenure and tenure squared as independent variables. The
coefficients on high school graduate and high school times unemployment were unchanged to two decimal places.
6 percent, going from 11 to 12 years of schooling has three times the 
effect on wages as going from 10 to 11 years. At a zero unemployment 
rate, completion of twelfth grade has more than four times as 
large an effect on wages as completion of eleventh grade. 21 At a 15 
percent unemployment rate, completion of twelfth grade has the 
same effect on wages as completion of eleventh grade. 
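
These comparisons can be approximated from column 1 of table 9. The sketch below is illustrative only: it assumes the unemployment rate enters in percentage points, it follows n. 21 in adding the continuous (cubic) education effect to the graduation dummy for twelfth grade, and it uses the rounded published coefficients, so the ratios are approximate. The function names are my own.

```python
# Column 1 of table 9 (rounded published coefficients).
EDU, EDU_SQ, EDU_CU = 0.078, -0.0062, 0.00029
HS_GRAD, HS_GRAD_X_UNEMP = 0.133, -0.010

def continuous_effect(years):
    """Cubic contribution of years of schooling to ln(real wage)."""
    return EDU * years + EDU_SQ * years ** 2 + EDU_CU * years ** 3

def grade_effect(grade, unemployment_rate):
    """Effect on ln(wage) of completing `grade`, given grade - 1 completed."""
    effect = continuous_effect(grade) - continuous_effect(grade - 1)
    if grade == 12:  # graduation dummy and its cyclical interaction
        effect += HS_GRAD + HS_GRAD_X_UNEMP * unemployment_rate
    return effect

def ratio_12_to_11(unemployment_rate):
    """Twelfth-grade effect relative to eleventh-grade effect."""
    return (grade_effect(12, unemployment_rate)
            / grade_effect(11, unemployment_rate))

for rate in (0.0, 6.0, 15.0):
    print(rate, round(ratio_12_to_11(rate), 2))
```

With these inputs the ratio is a little above 4 at zero unemployment, close to 3 at 6 percent, and somewhat below 1 at 15 percent, in line with the magnitudes described in the text.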

The procyclical behavior of the wage premium for completion of 
high school is especially surprising given the occupational distribu¬ 
tions of high school dropouts and high school graduates. High school 
dropouts are overrepresented among blue-collar workers: 54 percent 
of the blue-collar workers in the PSID sample were high school drop¬ 
outs, while only 10 percent of the white-collar workers were high 
school dropouts. Raisian (1983) finds that the wages of blue-collar 
workers relative to those of white-collar workers are procyclical. 22 
Hence the different occupations chosen by high school graduates and 
high school dropouts would lead the wage premium for high school 
graduation to behave countercyclically. 

Of course the procyclical behavior of the wage premium associated 
with graduation from high school may be due to a procyclical effect of 
education on wages. Unfortunately the data were inadequate to ob¬ 
tain meaningful estimates of the cyclical effects of both high school 
graduation and education on wages. 

Table 9 also shows that average quits per year that a worker was in 
the sample, when evaluated at the mean average quit rate of 0.11, has 
a negative effect on wages. This effect appears stronger for high 
school graduates than for high school dropouts. However, the statisti¬ 
cal significance is weak. The estimate obtained of the effect of the 
interaction between high school graduation and previous quits on real 
wages was sensitive to the particular independent variables included 
and the number of observations. When the model was estimated including 
constructed wage data from 1968 to 1976, the estimated 
coefficient (which as argued in n. 20 is likely to be subject to serious 
measurement error) was -0.080 with a t-statistic of -2.60 (the number 
of observations was 13,604). On the other hand, when average 

21 The effect of high school graduation on wages includes the effect calculated from 
the continuous education variables for completion of twelfth grade as well as the 
coefficient on the dummy variable for high school graduates. 

22 I used the same classification scheme as Raisian (1983) in determining which two-digit 
occupations were blue-collar and which were white-collar. 

23 Note that the formulation I have chosen has the form $\ln W_{it} = X_{it}\beta + \eta_{it}$, where 
$\eta_{it} \sim N(0, \sigma^2)$. Coleman (1984) has argued against assuming that $\eta_{it}$ is distributed 
independently and identically if one wishes to study cyclical effects. He argues persuasively 
that a more reasonable model is $\ln W_{it} = \ldots$. Consequently, the t-statistics 
for the ordinary least squares estimates of the cyclical variables should be viewed with 
caution. (However, most of the variance in county unemployment rates is due to cross-sectional 
rather than cyclical effects.) 
quits was replaced by ln(average quits + 1), the coefficient on high 
school graduates times ln(average quits + 1) fell to -0.072 with a 
t-statistic of -1.31 while the coefficient of ln(average quits + 1) was 
-0.081 with a t-statistic of -1.63. 

V. Corroborative Evidence 

The PSID provides some evidence suggesting that the discontinuous 
increase in earnings associated with graduation from high school is 
not due to a discontinuous increase in the cognitive skills of high 
school graduates associated with their superior learning ability. This 
would be an implication of human capital models in which choices of 
levels of education are determined by the efficiency with which 
individuals learn. 

In 1976 and 1978 respondents to the PSID were asked the training 
required for an average worker to learn their job. Better-educated 
individuals reported having jobs requiring more training. However, 
after correcting for the continuous effect of education, I did not find 
that high school graduates had jobs requiring more training. Since I 
had previously found a large negative effect of high school gradua¬ 
tion on quit propensities and quit rates (see tables 2-5), I would have 
expected that, if high school graduates and dropouts derive the same 
benefit from training, the former would have jobs requiring more 
training. The data in table 10 suggest that while employers may be¬ 
lieve that better-educated workers have a comparative advantage in 
acquiring job training, they do not believe that there are special skills 
in assimilating training associated with high school graduation. 24 

One problem with this interpretation of the data in table 10 is that 
individuals were asked how long it would take the average worker to 
become fully trained on their job. It is not clear how the respondents 
interpreted the phrase "average person.” They might interpret the 
average worker as someone like themselves. In that case high school 
graduates could be assigned to more complex jobs, but if they learn 
those jobs more rapidly, the two effects could cancel one another. 

In table 11, I took a different approach to this question. I estimated 
the effect on wages of the interaction between high school graduation 
and on-the-job training. In column 1 the effect was estimated for all 
workers, while in columns 2 and 3 I considered only workers who had 

24 An alternative explanation for the data in table 10 is that skills learned in high 
school by high school graduates are substitutes for skills taught on the job. However, 
that explanation seems inconsistent with better-educated workers’ being assigned to 
jobs requiring more training unless it is only the skills associated with high school 
graduation that are substitutes for training, while the other traits learned in school 
complement the skills learned on jobs. 
TABLE 10

Years of Training Required for Job in 1976-78
(PSID Sample, Ordinary Least Squares Estimation)

Independent Variable*             Coefficient
Intercept                          -2.46       (-5.36)
Education                            .310       (2.13)
Education squared                   -.24       (-1.46)
Education cubed                      .00119     (2.10)
Work experience                      .058       (4.63)
Work experience squared             -.0010     (-5.05)
Age                                  .033       (3.80)
Race = nonwhite                     -.87      (-16.29)
High school graduate                -.02        (-.23)
College graduate                    -.37        (-.197)
Married                              .12        (1.69)
Disabled                             .012        (.88)
R^2                                  .1638
Number of observations              9,941

Note.—t-statistics are in parentheses.

* The other independent variables were eight dummy variables
for different areas of the country.


completed the training needed to fully learn their job. When I esti¬ 
mated the coefficients listed in columns 1 and 2, I obtained the sur¬ 
prising (astonishing?) result that the continuous effect of education 
on the impact of on-the-job training on wages has the opposite sign as 
the effect of graduation from high school. (When I omitted the in¬ 
teraction between education and completed training, I estimated a 
negative coefficient for the effect of the interaction between high 
school graduation and completed training on wages.) 

On the other hand, the estimated coefficients in column 3 suggest 
that these results may be due to nonlinearities in the interaction be¬ 
tween education and returns to on-the-job training. In column 3 I 
allowed for a cubic specification of the interaction between education 
and on-the-job training and found that, for relevant levels of educa¬ 
tion, differences in education levels do not affect the impact of on- 
TABLE 11

Effect of Training on ln Wages of Male Private-Sector Employees in PSID
Sample, 1976-81

                                                           Coefficient
Independent Variable*                               (1)        (2)        (3)
Education                                           .073       .059       .057
                                                   (2.78)     (2.09)    (20.83)
Education squared                                  -.0064     -.0056      ...
                                                  (-2.33)    (-1.87)
Education cubed                                     .00030     .00030     ...
                                                   (3.48)     (3.12)
Required training for job currently held            .025       ...        ...
                                                   (8.00)
Attained training (includes workers still           .056       ...        ...
  in training)                                     (5.57)
High school graduate × attained training            .013       ...        ...
                                                   (2.07)
Education × attained training                      -.0036      ...        ...
                                                  (-3.98)
High school graduate                                .116       .105       .057
                                                   (4.43)     (3.49)     (2.01)
High school graduate × county unemployment         -.0093     -.010      -.0087
                                                  (-2.86)    (-2.82)    (-2.36)
Years of completed training                         ...        .099       .527
                                                             (10.01)     (7.49)
High school graduate × years of completed           ...        .017       .0045
  training                                                    (2.53)      (.48)
Education × years of completed training             ...       -.0049     -.121
                                                             (-5.08)    (-6.15)
Education squared × years of completed              ...        ...        .00975
  training                                                               (5.39)
Education cubed × years of completed                ...        ...       -.00026
  training                                                               (-.487)
Number of degrees of freedom                       9,510      6,526      6,526
R^2                                                 .522       .548       .543

Note.—t-statistics are in parentheses.

* The other independent variables were the same as those in col. 2 of table 9, except the interaction between high
school graduation and previous quits was omitted. In col. 3 education squared and cubed were also omitted. The
sample size falls in cols. 2 and 3 because I included only workers who had completed their training.


the-job training on wages, nor does there appear to be a discontinu¬ 
ous change in the impact of on-the-job training on wages associated 
with graduation from high school. 

Note that the coefficient of required training in column 1 is 0.025 
with a t-statistic of 8.00. This coefficient is consistent with the evidence 
presented in Weiss (1984) that workers on more complex semiskilled 
production jobs have higher quit rates, suggesting that workers dis¬ 
like those jobs and hence need to be rewarded with a compensating 
wage differential. It is also consistent with efficiency wage models that 
predict that firms pay higher wages in jobs for which the returns to 
effort or to ability are highest. 
There are several reasons why the estimated coefficients obtained 
in all three columns of table 11 should be viewed with particular 
caution. First, the range of realized education levels is fairly small, 
and education is allowed to affect wages in a wide variety of ways 
leading to serious problems of multicollinearity. Second, the process 
by which workers choose or are assigned to jobs is not known. The 
resulting simultaneous equation bias could be seriously affecting the 
estimated coefficients. Third, as noted in the discussion of the results 
in table 10, it is not clear what the training requirements reported by 
workers are actually measuring. These problems are added to the 
usual problems of model misspecification, misreporting of wages, and 
omission of fringe benefits in measures of reported wages that are 
likely to affect the estimated coefficients in table 11. 

VI. Sample Selection Bias 25 

As discussed above, there are persuasive reasons for believing that 
sample selection bias is not a serious problem for the analysis. How¬ 
ever, to the extent sample selection bias is a problem, the results 
obtained above are stronger than would be indicated from the esti¬ 
mated coefficients and /-statistics. 

Since only a trivial number of applicants were rejected for reasons 
other than low scores on the dexterity test, I shall restrict the discus¬ 
sion in this section to the biases introduced from the application deci¬ 
sion of workers. Thus we shall consider a model in which the sorting 
effect of wages differs across different groups of workers (see Weiss 
1980). 

Consider an unobserved application equation. A worker applies for 
a job with this firm if and only if 

$X_0\gamma_0 > \epsilon_1$, (10) 

where $X_0$ includes all observable characteristics of the worker and $\epsilon_1$ 
includes unobserved characteristics of a worker as well as random 
noise. Having applied and been accepted for this job, the worker 
subsequently quits if and only if 

$X_0\gamma_1 > \epsilon_2 \mid (X_0\gamma_0 > \epsilon_1)$ (11) 

and is absent if and only if 

$X_0\gamma_2 > \epsilon_3 \mid (X_0\gamma_0 > \epsilon_1)$. (12) 

Because better-educated workers had higher wages on their previ¬ 
ous jobs and generally have higher reservation wages when unem- 

25 The material in this section is a direct application of the seminal treatment of these 
issues in Heckman (1976). 
ployed, we would expect the coefficients on education and high 
school graduation in (10) to be negative. Since (10) is not estimated, 
within the sample we would expect $\epsilon_1$ and education to be negatively 
correlated. It seems reasonable to expect that, at least for employed 
applicants, $\epsilon_1$ is positively correlated with $\epsilon_2$ and $\epsilon_3$. A worker who was 
easily induced to leave his previous job is likely to be easily induced to 
leave his current job, and a low level of job commitment might also be 
expected to lead to a higher than expected probability of being absent. 
Therefore, we expect $\epsilon_2$ and $\epsilon_3$ to be negatively correlated with 
education in equations (11) and (12). 

Hence, for this sample there is positive bias on the estimated 
coefficients on education and high school graduation in the quit and 
absenteeism equations, resulting in an underestimate of the mag¬ 
nitude of the negative correlation between high school graduation 
and the propensity to be absent or to quit. (In the market equilibrium 
it is the unbiased correlation between high school graduation and quit 
or absenteeism propensity that is relevant for explaining the positive 
correlation between high school graduation and wages.) 
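
The direction of this bias can be illustrated with a small simulation. The sketch below is not the paper's estimation: it uses a single standardized education variable, hypothetical values for the coefficients and for the correlation between the unobserved application and quit terms, and a simple linear probability slope in place of the quit models estimated above. It compares the slope of quitting on education in the full population with the slope among those who apply.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var

def simulate(n=100_000, rho=0.8, gamma0=-1.0, gamma1=-0.5, seed=0):
    """Return (population slope, applicant-sample slope) of quits on education.

    All parameter values are hypothetical. rho is the correlation between
    the unobserved terms in the application and quit equations.
    """
    rng = random.Random(seed)
    edu_all, quit_all, edu_app, quit_app = [], [], [], []
    for _ in range(n):
        edu = rng.gauss(0.0, 1.0)  # standardized education
        eps1 = rng.gauss(0.0, 1.0)
        eps2 = rho * eps1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        quits = 1.0 if gamma1 * edu > eps2 else 0.0  # quit rule, as in (11)
        edu_all.append(edu)
        quit_all.append(quits)
        if gamma0 * edu > eps1:  # application rule, as in (10)
            edu_app.append(edu)
            quit_app.append(quits)
    return ols_slope(edu_all, quit_all), ols_slope(edu_app, quit_app)

population_slope, applicant_slope = simulate()
print(round(population_slope, 3), round(applicant_slope, 3))
```

Under these assumptions the slope among applicants is less negative than the population slope, which is the positive bias described in the text.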

VII. Review of Results 

Recent large-sample studies have shown returns to education in the 
region of 4-5 percent. There is a not dissimilar rate of return for the 
pay on their previous job for workers in the proprietary sample I 
assembled. A large component of this rate of return is the discontinu¬ 
ous increase in wages associated with high school graduation. 

Presumably the higher earnings of high school graduates are due to 
their better performance relative to high school dropouts. I inves¬ 
tigated four components of performance: output per hour, compara¬ 
tive advantage in more complex jobs, propensity to quit, and propen¬ 
sity to be absent. 

For the sample of semiskilled production workers, high school 
graduation appears to be uncorrelated with output per hour, and 
high school graduates did not appear to have a comparative advan¬ 
tage in more complex production jobs. On the other hand, high 
school graduates were significantly less likely to quit or to be absent. 
The wage premium received by high school graduates in the PSID 
sample appears to be procyclical. Consequently, if a low quit propen¬ 
sity and a low rate of absenteeism are more valuable during booms 
than slumps, these data suggest that a considerable fraction of the 
estimated return to high school graduation is due to unobserved traits 
that are associated with that credential. 

Finally, I found that high school graduates in the PSID do not have 
jobs requiring more training than would be expected from a continu¬ 
ous relationship between education and required training. 
Appendix A

A Description of the Proprietary Data Set 

Each of the workers in this sample was initially paid according to the same 
nonlinear piece rate. The form of this pay schedule was such that for newly 
hired workers wages were a convex function of output. When a worker 
achieved 83 percent of expected output for 1 month, he or she was assigned 
to a pay group. All the members of a pay group received the same pay. Pay 
was proportional to the output of the group—the average group size was 126 
members—and promotion opportunities were insignificant. Thus there was 
little financial incentive for group members to achieve high levels of output. 
Consequently, I used the output of workers only during their first month on 
the job in this study. (Among experienced workers, the range in output from 
the bottom to top deciles was less than 20 percent of the mean.) 

Almost all workers were assigned to a pay group within their first 3 months 
on the job. Therefore, expected lifetime earnings differences among the 
newly hired workers at each location were trivial. 

For each worker data were available on sex, age, marital status, education, 
employment status when he applied for work, an estimate of the time re¬ 
quired for an average employee to learn the worker’s job as well as his output 
per hour (measured in physical units and normalized by the industrial en¬ 
gineering force to be equivalent across jobs), number of days absent, number 
of occasions absent (consecutive days absent are a single occasion), whether or 
not the worker quit, and the date a quit occurred. The workers did not know 
that they were going to be subjects of an empirical study. All the data were 
routinely collected either by the personnel office, industrial engineers, or 
foremen at the three locations. Although the total sample contained 2,920 
individuals, complete data were not available for all workers. In some cases 
workers were assigned jobs for which there were no measures of job complex¬ 
ity; in other cases the worker failed to answer all the questions on the applica¬ 
tion form. Also at each location different information was available in the 
personnel records. In location C, race data were not available and only the 
screws half of the Crawford Physical Dexterity Test was administered; in 
the other two locations both the pins and screws sections of the dexterity test 
were administered before a worker was hired. The scores on this test were 
used by the firm in deciding whether to hire the worker but were not used in 
assigning workers to jobs. Job assignments were random. 

An additional problem arose at location C: the plant was divided into two 
halves, with significantly different promotional opportunities. Unfortunately 
the data did not reveal to which half of the plant particular workers were 
assigned, and that initial assignment may not have been random. At that plant, 
workers assigned to more complex jobs had better promotional opportuni¬ 
ties. Because of these unobserved differences in promotion opportunities, the 
relatively small sample size at that location, and the missing data on race, 
prior experience, employment status when they applied for this job, and 
scores on the pins section of the dexterity test, I did not use data from that 
location in estimating quit equations. 

Table A1 presents some summary statistics that provide a context within 
which to evaluate the data. 
TABLE A1

Mean Values of the Relevant Variables

Variable                                       Plant A    Plant B    Plant C

Percentage white                                 .72        .76        NA
Percentage black                                 .24        .20        NA
Age                                             24.6       25.3       26.7
                                               (7.52)     (7.07)     (8.23)
Education                                       12.1       12.2       11.9
                                               (1.20)     (1.26)     (.98)
Percentage employed at time of application       .64        .29        .38
  (1 if employed, 0 otherwise)
Percentage male                                  .57        .19        .41
Percentage married                               .43        .48        .42
Tenure on previous job (in years)               1.56        NA         NA
                                               (2.04)
Previous work experience (in years)             3.54        NA         NA
                                               (7.74)
Weeks to learn job                               7.5        5.1       11.7
                                                (4.9)      (2.4)      (4.4)
Score on screws test                            22.6       21.3       24.0
                                               (5.08)     (4.20)     (4.02)
Score on pins test                              22.4       23.5        NA
                                               (4.10)     (3.64)
First-month output                               111         63        105
                                               (19.7)     (22.2)     (29.2)
Percentage days absent                          2.96        2.3        3.0
                                               (6.53)     (4.23)     (4.38)
Percentage occasions absent                      1.2        1.1       1.12
                                                (4.5)      (1.4)      (1.1)
Quit rate in first 6 months on job (%)           9.8       18.2       12.3

Note.—Standard errors are in parentheses below the relevant variables. NA means data were not available. For
the subsample used to estimate the output equation, 56 percent were in plant A, 54 percent in plant B, and 10
percent in plant C.


Appendix B 

To evaluate the cost of absenteeism to firms, the nature of the production 
process is critical. Traditional economic analyses of absenteeism assume that a 
worker’s marginal product is equal to his wage and the cost of absenteeism is 
the wage of the worker. In this Appendix, we will consider the case in which 
the production process used by all firms requires k workers to operate. If 
more than k workers are present, the extra workers are redundant; they do 
not increase output. If fewer than k workers appear, output is zero. 

To simplify the notation, assume output is linear in the number of workers
and normalize the value of output to be equal to $k$. The number of workers
hired is denoted by $n$, the wage of each worker by $w$, the probability that a
worker appears for work by $p$, and the expected profit of a firm employing $n$
workers by $\pi(n)$. Let $P(S_n \ge k)$ denote the probability that at least $k$
workers are present when $n$ workers are hired. Finally, assume that the
absenteeism rate of each worker is common knowledge and that workers are not paid if







JOURNAL OF POLITICAL ECONOMY 


TABLE B1

Relationship between Absenteeism and Wages

Probability of      Profit-maximizing        Equilibrium
Being Absent        Number of Employees      Daily Wage

    .20                     30                 197.66
    .19                     30                 198.42
    .18                     29                 199.51
    .17                     29                 200.43
    .16                     28                 201.46
    .15                     28                 202.64
    .14                     27                 203.47
    .13                     27                 205.07
    .12                     27                 205.56
    .11                     26                 207.70
    .10                     26                 208.60
    .09                     25                 210.53
    .08                     25                 212.05
    .07                     24                 213.59
    .06                     24                 216.02
    .05                     23                 217.02
    .04                     23                 220.87
    .03                     23                 222.08
    .02                     22                 227.61
    .01                     21                 231.60
    .00                     20                 250.00

they are absent (this assumption is made so that we can focus on the indirect
costs of absenteeism rather than the direct cost of paying for an absent
worker): $\pi(n) = kP(S_n \ge k) - npw$. The expected marginal product of the
$(n+1)$th worker is $k[P(S_{n+1} \ge k) - P(S_n \ge k)]$. The net expected value
of the marginal worker is

$$
\begin{aligned}
\pi(n+1) - \pi(n) &= kP(S_{n+1} \ge k) - (n+1)pw - kP(S_n \ge k) + npw \\
&= pkP(S_n \ge k-1) + (1-p)kP(S_n \ge k) - kP(S_n \ge k) - pw \\
&= pk[P(S_n \ge k-1) - P(S_n \ge k)] - pw \\
&= p[kP(S_n = k-1) - w].
\end{aligned}
\tag{B1}
$$

The expected value of the marginal worker is the probability that the worker 
is decisive (enables the plant to operate) times the value of the production 
process, minus that worker’s expected cost to the firm. 
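
The step from the first to the last line of (B1) can be checked numerically. The following is a minimal sketch, assuming that the number of workers present is binomial with parameters n and p (independent attendance is my reading of the setup, and the function names and parameter values are purely illustrative). It compares the direct difference in expected profits with the closed form p[kP(S_n = k - 1) - w].

```python
from math import comb


def pmf(n, j, p):
    """P(S_n = j) when S_n is Binomial(n, p); zero outside 0..n."""
    if j < 0 or j > n:
        return 0.0
    return comb(n, j) * p**j * (1 - p) ** (n - j)


def tail(n, k, p):
    """P(S_n >= k): probability that at least k of n workers appear."""
    return sum(pmf(n, j, p) for j in range(max(k, 0), n + 1))


def profit(n, k, p, w):
    """Expected profit pi(n) = k * P(S_n >= k) - n * p * w."""
    return k * tail(n, k, p) - n * p * w


def marginal_value(n, k, p, w):
    """Direct difference pi(n + 1) - pi(n)."""
    return profit(n + 1, k, p, w) - profit(n, k, p, w)


def marginal_value_b1(n, k, p, w):
    """Closed form from (B1): p * [k * P(S_n = k - 1) - w]."""
    return p * (k * pmf(n, k - 1, p) - w)


# Illustrative values only (not taken from the paper).
k, p, w = 20, 0.9, 0.5
for n in range(18, 31):
    gap = marginal_value(n, k, p, w) - marginal_value_b1(n, k, p, w)
    assert abs(gap) < 1e-9
```

The two expressions agree because a newly hired worker changes output only when he shows up (probability p) and exactly k - 1 of the other n workers are present.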

A firm increases the number of workers it employs as long as (B1) is
positive; it chooses the smallest $n$ such that

$$
pk\,\frac{n!}{(n-j)!\,j!}\,p^{j}(1-p)^{n-j} - pw < 0, \tag{B2}
$$

where $j = k - 1$.

If we consider the case in which workers are paid even if they are absent,
the usual policy in U.S. firms for routine levels of absenteeism, (B1) is
rewritten as $pkP(S_n = k-1) - w$ and (B2) as

$$
pk\,\frac{n!}{(n-j)!\,j!}\,p^{j}(1-p)^{n-j} - w < 0.
$$

To illustrate the effect of absenteeism on the equilibrium value of a worker 
to the firm, let us consider a production process that requires 20 workers to 
operate, generates $10,000 of income per day if it operates, and has a fixed 
cost of $5,000 per day whether it operates or not. We shall assume that 
workers are not paid if they are absent. In equilibrium, if the probability of a 
worker’s being absent is known by all employers, firms compete for workers 
so that wages are bid up to the level at which each production process earns 
zero profits, and all the workers at a given plant have the same absenteeism 
rate, the relationship between absenteeism and wages shown in table B1 fol¬ 
lows. These calculations suggest that, even if employees are not paid when 
they are absent, in competitive markets employers might pay significantly 
higher wages to workers with lower probabilities of being absent. 
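
The logic behind table B1 can be sketched in a few lines of Python. This is my own reading of the zero-profit condition, not the author's code: with revenue R = 10,000, fixed cost F = 5,000, and k = 20, a plant hiring n workers who each appear with probability p breaks even at the wage w = [R P(S_n >= k) - F]/(np), and competition for workers pushes the wage to the highest such value over n. Attendance is assumed independent across workers, and the function names and the search bound `max_n` are hypothetical.

```python
from math import comb


def prob_at_least(n, k, p):
    """P(S_n >= k) for S_n ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def zero_profit_wage(n, absent, k=20, revenue=10_000.0, fixed_cost=5_000.0):
    """Daily wage at which a plant with n workers earns zero expected profit.

    Workers are paid only on days they appear, so the expected wage bill
    is n * p * w with p = 1 - absent.
    """
    p = 1.0 - absent
    return (revenue * prob_at_least(n, k, p) - fixed_cost) / (n * p)


def equilibrium(absent, k=20, max_n=60):
    """Plant size that supports the highest zero-profit wage, and that wage."""
    best_n = max(range(k, max_n + 1), key=lambda n: zero_profit_wage(n, absent, k))
    return best_n, zero_profit_wage(best_n, absent, k)


for absent in (0.00, 0.01):
    n, w = equilibrium(absent)
    print(f"absent={absent:.2f}  n={n}  wage={w:.2f}")
```

For absence probabilities of .00 and .01 this returns 20 workers at 250.00 and 21 workers at about 231.60, in line with the last two rows of table B1.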
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General Equilibrium with Real Time Search in 
Labor and Product Markets 


Birger Wernerfelt 

Northwestern University 


The paper is concerned with economies in which agents find sellers 
and employers in a time-consuming search process while they simul¬ 
taneously trade with their current partners. A symmetric steady- 
state equilibrium does not exist, but asymmetric steady-state equilib¬ 
ria exist and are such that larger firms offer higher wages and 
charge lower prices than smaller firms, but still make more profits. 
These profits can be seen as rents from a superior market position. 


I. Introduction 

The Walrasian auctioneer and tatonnement process, which eliminates 
out-of-equilibrium trades, is central to traditional microeconomics. 
Except for very few organized markets, it is, however, not a realistic 
conception of actual market processes. Nor is it representative of the 
way economic agents view modern society: businesspeople often talk 
about market share as an asset in itself and sometimes look at advanta¬ 
geous factor market positions as the key to their competitive advan¬ 
tage. To the extent that it is appropriate to study social processes in 
the categories of the involved individuals, this could be seen as an 
undesirable feature of the Walrasian tradition. 

While it is widely recognized that real market processes exhibit a lot 
of trade on the way to Walrasian equilibria, there is very little work on 
the reasons for and implications of this. In fact, most models of search 
do not operate in “real time" in the sense that agents are assumed to 
finish collecting information before the trade. The literature on truly 
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dynamic effects has been based on the original insight of Arrow 
(1959), according to which the time needed to search out alternatives 
produces lags in buyer shopping responses such that sellers have 
“dynamic monopoly power" (see also Diamond 1971). This has been 
investigated in partial equilibrium models of employment turnover 
with on-the-job search (Burdett 1978; Jovanovic 1984) and (simulta¬ 
neously with the present effort) in a static partial equilibrium model 
by Mortensen (1985). 

This paper extends the arguments above to a general equilibrium 
model in which each agent has both a labor and a consumer side. I 
find that symmetric steady-state equilibria do not exist whereas asym¬ 
metric steady-state equilibria exist and are such that larger firms offer 
higher wages and charge lower prices than smaller firms but still 
make more profits. These profits can be seen as rents from a superior 
market position. 

II. Model 

I consider the properties of steady-state equilibria in a production 
economy in which firms post wage and price offers and agents engage
in time-consuming but otherwise costless search. I will be able to 
characterize the viable wage-price offer strategies, the profits associ¬ 
ated with each, and the size distribution of firms. The model is based 
on ex ante identical consumers and endogenous firm formation, such 
that only the randomness of the search processes generates strategic 
heterogeneity. 

I advise the reader that the product side of the economy would be 
identical to that of Mortensen (1985) if the discount rate were zero. 

A. An Atomless Production Economy 

Let us look at an atomless economy with overlapping generations in
infinite-horizon discrete time. In all periods $t = 1, 2, \ldots$ there is a
unit measure of identical agents, each of whom is endowed with a
single indivisible unit of effort per period. This effort may be spent
either on labor or on leisure. The utility of leisure is normalized to the
rate one, and an individual’s life expectancy is independent of how he
spends his time. The only other good in the economy is also valued at
one utile per unit and is produced by firms from units of labor according
to a differentiable function $q(\cdot)$, which expresses the measure
of labor units required to produce one unit of the good as a function
of the contemporaneous measure of labor units employed in the firm.
Assume that $q(\cdot)$ is decreasing and convex. As its argument goes to
infinity, $q$ approaches $1/y$, which is strictly less than one, such that the
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economy is productive. Assume that individual agents can be re¬ 
garded as wage and price takers. Firms may be formed costlessly by 
any agent, and each firm produces at full capacity and satisfies as 
much demand as possible while the owning laborer consumes any 
surplus units. 1 Assume that firms are passed on to an inheritor of the 
owner such that they can exist in perpetuity. 

Newborn agents are ignorant of labor and buying opportunities but
can costlessly receive random, independent offers of each, at rates $\mu$
and $\lambda$. The sequence of events in a given period is as follows. First,
agents who have received both a wage and a price offer in the past
select a prospective employer and prospective seller to go to. Second,
each firm announces a single wage to the agents who turned up to
consider working and a single price to those who turned up to consider
buying. Third, agents who work and buy become “active” laborers
and buyers if and only if the wage announced by their employer is
at least as large as the price announced by their seller. Fourth, a
randomly chosen fraction $\tau$ of all agents die. Fifth, a measure $\tau$ of
agents are born. Sixth, each agent observes one additional randomly
chosen wage with probability $\mu$ and one additional randomly chosen
price with probability $\lambda$. (Since the more attractive prices and wages
will be offered by more productive firms, price search needs to be
made faster than wage search, so assume that $\lambda > \mu$.) Note that
observing a firm’s wage does not entail observing its price, and vice
versa. Agents can recall all wage and price offers they have observed
and decide which firms to approach in a given period on the basis of
their expectations about the firms’ actions in the period. All agents
maximize expected lifetime utility using the discount factor $\delta < 1$, and
it is not possible to store either units of labor or units of the consumption
good. 2

Firms will not be required to offer constant wage-price pairs, but we 
will focus on equilibria in which they do this. In such equilibria, an 
agent’s situation in any period is summarized by the highest wage and 
the lowest price, if any, he has observed so far. Because search is 
costless, all agents will search all the time, working themselves up the 
wage distribution and down the price distribution. Since these processes
are furthermore independent, the period $t$ density of agents at
wage $w$ and price $p$ can be expressed as the product of the marginal
densities $h(w)$ and $k(p)$.
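
The way agents work themselves up the wage distribution can be illustrated with a small simulation. It is only a sketch under assumptions that are not in the model: wage offers are drawn from a fixed uniform distribution on [0, 1] (in the paper the offer distribution is endogenous), a finite population stands in for the unit measure of agents, and the parameter values are arbitrary. Each period follows the stated timing: a fraction $\tau$ of agents die and are replaced by newborns, and then every agent observes one wage with probability $\mu$ and keeps the best offer seen so far. The closed form for the share of agents who have not yet seen any offer is my own derivation from that timing.

```python
import random


def simulate_wage_search(n_agents=2000, periods=200, tau=0.05, mu=0.3, seed=0):
    """Return each agent's best observed wage (None if no offer seen yet)."""
    rng = random.Random(seed)
    best = [None] * n_agents
    for _ in range(periods):
        for i in range(n_agents):
            # Death and replacement by a newborn who knows no offers.
            if rng.random() < tau:
                best[i] = None
            # Costless search: observe one wage with probability mu and
            # adopt it myopically if it improves on the best offer so far.
            if rng.random() < mu:
                offer = rng.random()  # hypothetical U[0, 1] offer distribution
                if best[i] is None or offer > best[i]:
                    best[i] = offer
    return best


tau, mu = 0.05, 0.3
best = simulate_wage_search(tau=tau, mu=mu)

no_offer_share = sum(b is None for b in best) / len(best)
held = [b for b in best if b is not None]
mean_best_wage = sum(held) / len(held)

# Steady-state share with no offer: u = (1 - mu) * [(1 - tau) * u + tau].
predicted_no_offer = (1 - mu) * tau / ((1 - mu) * tau + mu)

print(f"share without an offer: {no_offer_share:.3f} (predicted {predicted_no_offer:.3f})")
print(f"mean best wage held: {mean_best_wage:.3f} (mean offer is 0.5)")
```

The mean wage held by agents lies well above the mean offer, which is the sense in which agents climb the wage distribution; the same mechanism applied to prices moves them down the price distribution.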

The situation of a firm in any period is summarized by the measure 

1 The objective of the firm will then be maximization of discounted surplus, which 

differs from discounted profits if firms are heterogeneous. While this formulation is 
convenient, it is, as far as I can tell, not crucial for the essence of the results. 

2 In the steady-state equilibrium, storage will never be optimal, but this is deceptive
since the equilibrium is supported by a no-storage assumption.
of its pool of laborers, $l_t$, and its pool of buyers, $b_t$. Because search is
costless, each firm has the same distribution of agents in these pools,
and the translation to active laborers $\hat{l}_t$ and active buyers $\hat{b}_t$ is given by

$$
\hat{l}_t = l_t \int_0^{w_t} k_t(x)\,dx, \qquad \hat{b}_t = b_t \int_{p_t}^{\infty} h_t(x)\,dx,
$$

if the firm offers $w_t$ and $p_t$, respectively, and agents believe that these
offers are constant over time. Since an active buyer who makes $w_t \ge p_t$
will buy $w_t/p_t$ units, the revenues of the firm are $b_t \int_{p_t}^{\infty} x h_t(x)\,dx$, while
production costs are $w_t l_t \int_0^{w_t} k_t(x)\,dx$. The fact that the owner consumes
surplus output can be expressed by the budget balance condition

$$
w_t l_t \int_0^{w_t} k_t(x)\,dx = b_t \int_{p_t}^{\infty} x h_t(x)\,dx. \tag{1}
$$


Before we look at the properties of steady-state equilibria of this 
economy, it is helpful to note that the perfect information equilibrium 
is one in which all firms operate at minimum efficient scale while the 
wage-price ratio is y. Prices will be normalized such that the full- 
information price is one. 


B. Existence Results for Steady-State Equilibria 


A steady-state equilibrium of the model described above is a situation 
in which (i) agents follow the optimal labor-buyer-search strategies, 
(ii) the distribution of laborers and buyers is time invariant, (iii) firms 
make time-invariant wage and price offers that maximize discounted 
surplus production, assuming constant wage-price offers of other 
firms, and (iv) no new firms are formed. 

As the search process is specified, higher wages and lower prices 
will command larger steady-state pools of laborers and buyers, so that 
if two firms offer the same wage (price), they will also offer the same 
price (wage) in equilibrium. Accordingly, a steady-state equilibrium is 
completely characterized by four time-invariant functions from $R_+$ to
$R_+$: (i) the density of the laborers over wages, $h(w)$; (ii) the density of
buyers over prices, $k(p)$; (iii) the measure of firms over wages, call it
$f(w)$; and (iv) the measure of firms over prices, call it $g(p)$. The linkage
between wage and price offers can be found from the market level of
aggregation of (1):

$$
w h_t(w) \int_0^{w} k_t(x)\,dx = k_t(p) \int_p^{\infty} x h_t(x)\,dx,
$$

which implicitly defines $w(p \mid h, k)$ or $p(w \mid h, k)$. We therefore have the
equilibrium condition

$$
g_t(p(w \mid h_t, k_t)) = f_t(w) \quad \text{or} \quad g_t(p) = f_t(w(p \mid h_t, k_t)).
$$
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In a steady state, in which laborers and buyers expect constant 
offers, the optimal search strategy entails constant search and myopic 
adoption of any improvement. In this case, the market-level implica¬ 
tions are 


h l+ i(u>) - h,(w) = [ jp~ — ^ ~ + I, hffidx^Lf(u) 
- A f (a»)J^T + p £ f,(x)dx j 


( 2 ) 


and 


k l+i (p) - k t (p) = [ (/j. ^ + A + \p *»(*)<**] 

- k,(p)^r + A | o £,(*)<& j 
such that the steady-state conditions are 

i iT- ill ' l l + 1 - *■<”>[* + 


(3) 


and 


[a ^xjVTx + [ M»*]W> - «/»[’ + * [' 


From (1) the surplus of a firm characterized by l and b can be 
expressed as 

b t xA,(x)rfx^ju^|/, £ A,(x)dxj j - p e ~'j 55 ■*(/„ b„ w„ p,). 

Accordingly, the maximization problem facing firms is to find se¬ 
quences w(t|/o, b n ) and p(t\lo, bo) that satisfy (1) and 


30 

max X SW,, b(, w ( , pi)dt, 

f-0 

4 *>" 4 “[ "(i + 1 

- /, Jr + p f,{x)dx j ■ L,(l„ u> t ), 

”•*'< - *» m [ <i--Iy + x + 1 i 

- b t |t + A £ g<(x)dxj ® B,(b„ p t ), 


(4) 


(5) 


( 6 ) 
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where n is the measure of firms in the economy. For this model the 
following theorem applies. 

Theorem 1. There are no steady-state equilibria in which $f(\cdot)$ and
$g(\cdot)$ have (a) mass points or (b) unconnected support. 3

Proof. See the Appendix. 

An important implication of this is the following corollary. 

Corollary. Symmetric steady-state equilibria do not exist.

Inspection of the proof of theorem 1 gives an idea about the critical 
assumptions: if there are direct or opportunity costs of search or 
switching costs, the argument does not go through. If it is assumed
that $\delta$, $\lambda$, $\mu$, $\tau$, and $q(\cdot)$ are such that the control problem (1) and (4)-
(6) satisfies its second-order conditions, the following theorem applies.

Theorem 2. There exists an asymmetric steady-state equilibrium. 

Proof. See the Appendix. 

We can find $n$ from the requirement that no more firms are
formed. This can be done by considering the maximization problem
for a firm for which $l$ and $b$ initially are zero. Let us denote the
smallest and largest wages (prices) by $\alpha$ and $\beta$, respectively, and use
subscripts to denote partial derivatives. We then get, after considerable
manipulation,

-rrfc£(|3)X 4- 7T//i(a)|x = 0. (7) 


C. Properties of Asymmetric Steady-State Equilibria 

The first important property of equilibrium is that the firms offering 
low wages offer high prices, and vice versa. Heuristically, this hap¬ 
pens because firms with large (small) buyer pools need large (small) 
labor pools. Note further that the highest (lowest) price in the market
is equal to the highest (lowest) wage such that the optimal strategies lie
on a curve in the box $[\alpha, \beta]^2$. This is illustrated in figures 1 and 2. This
wage and price pattern implies that the largest firms lie on the highest
cost curve while smaller firms lie on lower cost curves but are so
inefficient that their net costs are higher. This is illustrated in figure 3.
We can use equation (1) to compute the profit margins for various
$(w, p)$ pairs and to convince ourselves that markets will clear at each
price, taking into account the surplus consumed by the laborer who
owns the firm.

It is possible to interpret the equilibrium in almost Newtonian 
terms because of analogies to the constancy of energy. Consider the 

3 If q(-) is U-shaped, we need to be concerned about a degenerate equilibrium on the 
upward-sloping part of q. 
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Fig. 2. —A sample solution 
equilibrium curve in figure 4. The largest firms have the highest
“potential energy” by virtue of large $l$'s and $b$'s. They could milk this
position by lowering wages or increasing prices. The differential surplus
from having this position is, however, sufficiently high to make
such a milking strategy unappealing. On the other hand, it is so ex¬ 
pensive to move up to this position that the involved costs just prevent 
smaller firms from moving up. So the costs or benefits from moving 
along the equilibrium curve are such that no one wants to move. The 
value of being a firm of type x plus the costs or benefits of moving to 
position y are equal to the value of being a firm of type y. In case the 
interest rate goes to zero, these mobility barriers disappear and the 
profits go to zero throughout. 4 Note finally that (7) gives the entry 
condition such that ex ante surplus is zero for an outsider. So while 
incumbent firms earn surplus, entrants would have to pay a fair price 
for it in the entry process. 

III. Conclusion 

Since the results of the paper are summarized in the Introduction, I 
will not repeat them here. Instead, I will emphasize that models of 

4 In this case o —* 1 and 0 —► while the size of the largest firms approaches 

minimum economic scale. 
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real time search depict a good market position (be it in product or 
factor markets) as an asset that may yield very significant quasi rents. 
Markets are networks of more or less durable long-term relationships 
rather than auctions. Since one does observe such trading patterns in 
the economy and since the equilibria in these models give rise to quite 
reasonable macro predictions, there would seem to be a case for con¬ 
tinued work in this area. 

The wage and price dispersion in this paper is generated by the 
random nature of the search process even though all agents are iden¬ 
tical ex ante. The same idea is exploited in a recent model by Al¬ 
brecht, Axell, and Lang (1986), which exhibits quite different results. 
In particular, they show the existence of several equilibria with a finite 
number of wage-price pairs. This result follows from their assump¬ 
tion that there is no on-the-job search and no utility of leisure. Ac¬ 
cordingly, if a firm deviates a bit from, say, a two-price, two-wage 
equilibrium, it need not get more buyers or laborers. It draws only 
from the pool of uncommitted agents, and if the opportunity costs of 
search are high and “few” firms are expected to offer better deals, the 
members of this pool may not search more because of the deviation. It 
should also be mentioned that Albrecht et al. constrain firms to con¬ 
stant wage-price offers, so the possibility of exploiting dynamic mo¬ 
nopoly power is ignored. 

While the mathematical difficulties may be nontrivial, it would be 
desirable to consider the possibility of non-steady-state equilibria in 
the model. On a different level, the present model could be general¬ 
ized to incorporate switching costs, costly search, advertising, and so 
forth. 
Proof of Theorem 1

(a) If $f$ has a mass point, an infinitesimal wage increase by any of the involved
firms will give that firm a discrete upward jump in its steady-state pool of
active laborers. Since $q$ is decreasing, such an action will increase surplus. (b) If
there is a gap in the support of $f$, firms immediately above the gap could
benefit from moving immediately below it. Q.E.D.


Proof of Theorem 2 

In a steady-state equilibrium agents maximize expected lifetime utility by
behaving as described in (2)-(3) or (4)-(5). We can show existence by assuming
this behavior and demonstrating that the game (1) and (4)-(6), played
between a fixed measure of firms, has a steady-state equilibrium.

The proof is structured as follows. We will first use (1) and various accounting
identities to simplify the game such that the players select functions
Wf(/i|A, + ). where hf is an infinite sequence h,, h,+ 1 , ... of densities of labc 
ers over wages. Consider a sequence of identical such densities, called I, 
Given A 0 , there is a function ©(IJA 0 ) that will keep a firm's / constant. Similarl 
there is a function that will maximize (4) given (1), (5), and (6). T1 

idea in proof is that one can select h° such that u>(/|/r) = Since (2) ar 

(3) are aggregates of (5) and (6), these strategies will generate the sequence h 

For starters, theorem 1 states that all relevant functions are jointly continuous.
In particular, this is true of $h$, $k$, $\pi$, $L$, and $B$. Now let us use (1) to find
$p_t(w_t \mid l_t, b_t, h_t, k_t)$. Further, knowing that $w(l)$ is monotonic, we can construct its
inverse $l(w)$ and use the accounting identity $fn = h/l$ to rewrite the steady-state
conditions


(1 ~ 

(I - p.)T + p. 


+ 




r*feU£i. 

)v l(x) n J 


Similarly, we can construct b(p) to get 


(1 ~ X)T 
(1 - \)T + X 




b(x) « J 


(5 


(G 


Given h and k, for any l' there exists a unique b' for which the solution to (5 1 
is identical to the solution to (6’), w{p(b')). In the following we w 
restrict our attention to pairs l, b that satisfy this relationship. Hence we ca 
suppress b. Similarly, the relationship b'(T) defines a unique k for any h, so v 
can work with sequences of h alone and suppress the associated k’s. T1 
problem is now somewhat simpler since we can perform the analysis on u>, 
and h. 

Define M as the space of integrable functions from R + to R +, each of whic 
is positive on a single interval [a, 0] C R + (a < 1 < P) and zero elsewhere; M 


5 For games of this type, Jovanovic and Rosenthal (1986) show the existence of an
equilibrium in which the state-action distribution is constant. Unfortunately, this does
not rule out “cycling,” so although their equilibrium is stationary in one sense, it is not a
steady state in this sense.
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is the subset of M that is continuous and monotonous on [a, $}. Given any 
member of M, say A', (5') identifies a unique /(w|A'). For any l(w{h') £ M mi (5') 
is an ordinary differential equationof the first order in A. So (5'j maps M onto 
We can denote the inverse of I by W(l\h'). This is clearly continuous in A. 

If we substitute f(x) = h(x)/[l(x)n] into (5) and (6), the game (1) and (4)-(6) 
becomes a set of control problems for any infinite sequence of identical densi¬ 
ties h° ~ A*, h *,... , A* £ Af. Given A 0 , standard results tell us that (1) and (4 )- 

(6) has a solution wf (4|A°), t — 1,2.Similarly, because w* is a continuous 

function of A* and integrals of A*, we can find densities A** e Af such that 
A**,...) = u)(l) for all l(w) £ M m . To do this, take the interval [a, fi] 
from i'(w). After this, ui*(t[h) = w(l) becomes another ordinary differential 
equation in A. We can define C as the correspondence from l(w) to A**. From 
the second-order conditions and the monotonicity of w(l), we know that the 
control problems are well behaved such that the optimal policy is continuous 
in the exogenous functions. Therefore, C is upper semicontinuous. Any fixed 
point of C 0 f(w|A) identifies an A for which the optimal actions of each player 
(a) keep the state of that player constant and (A) preserve the aggregate state 
distribution. 

The existence of such a fixed point can be guaranteed by the Fan-Glicksberg
fixed-point theorem (Fan 1952; Glicksberg 1952). We have already
established upper semicontinuity, and it is straightforward to verify
nonemptiness and that $C$ is closed- and convex-valued. To check that
$C \circ l(w \mid h)$ maps $M$ into $M$, start in $M$ and apply $l$ to get into $M_m$, from which $C$
maps into $M$ again. Since $M$ is convex, we are done. Q.E.D.
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The employment relationship with employees’ ability and their ac¬ 
tions both private information (thus combining adverse selection 
with moral hazard) is modeled as a repeated game with self- 
enforcing contracts being perfect Bayesian Nash equilibria. Under 
termination contracts, the equilibrium contract structure consists of 
a hierarchy of ranks, finite in number even though ability is continu¬ 
ous. Reputation acts as an effective device for worker discipline with¬ 
out the need for involuntary unemployment. Selection by bonding is 
not, in general, incentive compatible, but selection by promotion of 
employees through the ranks is. Many other features correspond to 
observed employment structures. 


I. Introduction 

It is widely recognized that moral hazard and adverse selection pose 
serious problems for the operation of labor markets. We argue here 
that their interaction with fixed wage contracts generates employment 
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structures that have many features of those actually observed, espe¬ 
cially in large organizations. Incentive requirements result in a hierar¬ 
chy of ranks or grades within the organization, with ability and per¬ 
formance increasing with rank. The wage also depends on rank, but 
not directly on performance, and the endogenously determined num¬ 
ber of ranks is finite even though ability is a continuous variable and 
there is no financial cost to establishing more. This hierarchy, unlike 
that in Calvo and Wellisz (1979), Rosen (1982), Waldman (1984), and 
the Marschak and Radner (1972) theory of teams, does not arise 
because employees at different levels perform different exogenously 
specified tasks, though since the hierarchy sorts employees by ability, 
it is clear that in a more general model different tasks might be as¬ 
signed to different ranks. 1 The hierarchy thus does not consist of 
exogenously given tasks but of endogenously generated pay, per¬ 
formance, and ability of the kind that has been made explicit in the 
employment structures of many large Japanese companies, with em¬ 
ployees’ rank and status distinct from the particular task to which they 
are assigned (see Dore 1973). 

In employment an important element of moral hazard arises from 
the practical difficulty of finding measures of an employee’s perfor¬ 
mance that can be verified in court. Without such verifiability, con¬ 
tracts with the wage conditional on performance (piece rate contracts) 
will not be legally enforceable, with consequent problems for provid¬ 
ing incentives that are of great concern to managers. Recognition of 
this underlies the insights of Alchian and Demsetz (1972), first, that 
one of the central roles of firms is monitoring employees to ensure 
that they work and, second, that this role cannot be replaced by a 
piece rate contract of the type derived in standard principal-agent 
models because, when there is a genuine team element in production, 
the output of any one employee is not separately distinguishable from 
that of other employees in a way that can be readily verified in court. 
Indeed, the empirical limitations of standard principal-agent models 
are now recognized in the literature. As Hart and Holmstrom (1987) 
note, “the extreme sensitivity [of contracts] to informational variables 
that comes across from this type of modelling is at odds with reality." 

To this moral hazard problem, we add the adverse selection prob¬ 
lem that arises when new entrants to the labor market know more 
about their own ability or disutility of effort than potential employers. 
(In the formal model, the relevant parameter can be interpreted as 
either ability or disutility of effort. We use the term “ability” to stand 
for both.) This, too, is clearly an important aspect of labor markets. 

1 See MacDonald (1982), however, for an assignment model somewhat different
from those cited.
The complex selection procedures adopted by many employers bear
witness to both the cost and the importance of acquiring information
of this sort.

Malcomson (1981) and Shapiro and Stiglitz (1984) have argued that
contracts with the wage independent of performance can be used to
provide incentives for employees under this type of moral hazard.
What stops employees from doing no work while still collecting the
wage is the threat that the contract will be terminated, and fewer good
opportunities be available in the future, if the employer is dissatisfied;
hence the name “termination contract.” As Shapiro and Stiglitz observe,
many employment contracts take this simple form. Often they
are not even written down.

Reputation has an important dual role here. It provides information
to potential employers about the quality of employees and also,
because poor performance results in loss of reputation, gives an incentive
to employees to perform satisfactorily. In the latter role, it
removes the need for the involuntary unemployment that is required
to ensure satisfactory performance in Shapiro and Stiglitz (1984). In
the former role, the combination of reputation with the rank structure
overcomes the barriers to labor mobility noted by Greenwald
(1986) that arise from adverse selection when the present employer
knows more about an employee’s ability than other potential employers.
As a result, employees not fired from their previous job have the
potential to change jobs without harming their reputation for ability,
as many appear to in practice.

In the present model employees cannot be induced to self-select the
appropriate rank by being required to post bonds to guarantee
performance. Bonds low enough to induce the right employees to select a
rank are too low to prevent lower-ability employees from selecting
that rank and then shirking. But they can be induced to self-select by
a promotion scheme based on performance because high-ability
employees have a comparative advantage in performing well. Thus the
model can generate job ladders and promotion even in the absence of
human capital acquired on the job. It thus differs from the adverse
selection models of Salop and Salop (1976), Calvo and Wellisz (1975),
and Guasch and Weiss (1981), in which the contracts offered by
employers induce immediate self-selection. Incentive compatibility of the
promotion structure is maintained by ensuring that it is optimal for
employees to quit rather than accept demotion with their present
employer, which provides a theoretical rationale for this labor market
convention.

During the period they are being promoted, employees receive a
wage less than their marginal product, but once they reach the rank
appropriate to their ability, the two are equal. Hence the wage in the
early part of their working lives is lower relative to marginal product
than later on. Moreover, employees who have been in a rank for some
time have, on average, lower productivity than those newly promoted
because some of the latter are working for further promotion. These
findings are consistent with those of Medoff and Abraham (1980).
Because of the relationship of the wage to the marginal product during
the period of promotion, firms make profits at this stage of an
employee’s life. Those profits cannot be handed back to employees as
higher wages later on because if at any time the present discounted
value of future wages promised to an employee is greater than that of
marginal products, it is in the firm’s interest to fire the employee. This
is not true, however, if a firm rewards senior employees by providing
facilities on which its expenditure is independent of its firing decisions.
Examples are a luxurious headquarters building or lavish country
club, precisely the types of expenditures that Williamson (1963)
attributes to managerial firms for which profit is not the only goal.
The present model thus provides a reason why profit-maximizing
firms may provide these kinds of benefits.


II. The Model 

Consider a firm operating in a competitive market with many identical
firms. It may hire employees for any period of time in multiples of
length $\gamma > 0$, $\gamma$ being the minimum period (e.g., hour, day, week, or
year) for which the firm hires. That such a minimum period exists in
practice seems unquestionable; one simply does not observe continuous
reassessment of contracts. To model the reasons for this would
complicate the ensuing analysis considerably, so here $\gamma$ is taken as
exogenous. It is not, however, hard to envisage the factors involved in
the choice of the length of hiring period. Decisions are costly in terms
of time, continuous monitoring of worker performance is certainly
more costly in terms of supervision than intermittent monitoring, and
problems of imprecise measurement of performance may be reduced
by averaging observations over longer periods.

We assume that the firm requires at least two employees to produce
any output (the performance of any one cannot then be deduced
solely from the firm’s revenue or profits) and that neither the input
nor the output of an individual employee is verifiable in court. Given
at least two employees, there are constant returns to scale, with the
marginal product of an additional employee (measured per unit of
time) denoted $y(p_t)$, where $p_t \ge 0$ is a measure of the employee’s
performance in the $t$th period of length $\gamma$. The function $y(p_t)$ has the
following conventional properties, with $k$ being a fixed cost of
employment.

Assumption 1. The function $y(p_t)$ is twice differentiable,
nondecreasing, and concave for $p_t > 0$, bounded above, with $y(0) = -k < 0$
and $\lim_{p_t \to 0} y'(p_t) > 0$.

Output is sold at the end of each period, and since $y(\cdot)$ is measured
per unit of time, the marginal product of employing an additional
employee for the $t$th period of length $\gamma$ is $\gamma y(p_t)$, received at the end of
the period. If at that time the employee is paid a wage $\gamma w_t$, the present
discounted value of the firm’s profits from employing an extra worker
from period $T$ on over its infinite time horizon is

$$\Pi_T = \sum_{t=T}^{\infty} \gamma\,[y(p_t) - w_t]\,\beta^{t+1}, \tag{1}$$

where $\beta \le 1$ is the firm’s discount factor, with exponent $t + 1$ since
payments are made at the end of the period.

An employee’s performance depends on both effort and ability, the
latter denoted by $\theta \in [\theta^-, \theta^+]$, with density function strictly positive
for all $\theta$. Since $\theta$ has no natural units, without loss of generality we
take $\theta^- > 0$. Employees know their own ability but firms do not, so
new entrants to the labor market are observationally identical to
firms; differences are those unrevealed after any signaling. Employees
also know their own performance and the employing firm observes
it (with, we assume for simplicity, no error), but nobody else
does.

Assumption 2. Employees live forever. Those with ability $\theta \in [\theta^-, \theta^+]$
have lifetime expected utility from the beginning of period $T$
given by

$$V_T = \sum_{t=T}^{\infty} \gamma \left[ u(w_t) - \frac{v(p_t)}{\theta} \right] \delta^{t-T},$$

where $\delta = \exp(-\gamma\rho) < 1$, $\rho$ being the constant rate of time preference.
The function $u(w_t)$ is increasing, twice continuously differentiable,
concave, and normalized with $u(0) = 0$. The function $v(p_t)$ is
twice continuously differentiable and strictly convex with $v(0) = 0$,
$\lim_{p_t \to \infty} v(p_t) = \infty$, $v'(0) = 0$, and $v'(p_t) > 0$ for $p_t > 0$.

In assumption 2, for notational convenience, utility is measured at
an average rate per unit of time over the period of length $\gamma$
discounted to the end of the period.$^2$ Additive separability in income and


2 It is in the spirit of the model to have effort, $p_t$, expended continuously over the $t$th
period of length $\gamma$ but the wage, $\gamma w_t$, paid only at the end. To see how this corresponds
with the form of the utility function in assumption 2, let $u^*(i)$ denote the instantaneous
utility of receiving income $i$ and $v^*(p)/\theta$ the instantaneous disutility of performance for
type $\theta$, at each moment of time. Then the functions $u(w_t)$ and $v(p_t)$ of assumption 2 can
effort is standard in models of moral hazard applied to employment
(see MacDonald [1984] for a discussion of its role), and since the scale
of $\theta$ is arbitrary, the way ability enters the utility function is less
restrictive than it might at first sight appear. Both these properties
greatly simplify the presentation. We denote by $u^0(\theta) \ge 0$, assumed to
be continuous in $\theta$, the utility per unit of time of an employee with
ability $\theta$ who is not employed. Firms know the utility function.

If the performance of each employee were verifiable information
so that there was no moral hazard, a piece rate contract of the form
$w_t = y(p_t) - c$, where $c$ is a lump-sum transfer, could be used and then
employees with ability $\theta$ would choose the efficient performance level
$p^*(\theta)$ and receive wage $w^*(\theta)$, defined by

$$p^*(\theta) = \operatorname*{argmax}_{p_t} \left\{ u[y(p_t)] - \frac{v(p_t)}{\theta} \right\}, \tag{2}$$

$$w^*(\theta) = y[p^*(\theta)] - c. \tag{3}$$

This does not require $\theta$ to be known by firms, so its unobservability
does not alone prevent efficient contracting.$^3$ We assume that the
maximized value of the maximand in (2) is at least as great as $u^0(\theta)$ for
all $\theta \in [\theta^-, \theta^+]$ since agents for whom this does not hold would not be
employed by any firm even with no moral hazard and no adverse
selection. Then, by assumptions 1 and 2, $p^*(\theta)$ exists, is unique, and is
strictly positive for all $\theta \in [\theta^-, \theta^+]$ and $p^*(\theta)$ and $w^*(\theta)$ are increasing
in $\theta$. The foregoing provides a useful reference point for what follows.
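
As a purely illustrative sketch of the benchmark in (2) and (3), the
Python fragment below computes $p^*(\theta)$ and $w^*(\theta)$ under assumed functional
forms that satisfy assumptions 1 and 2: $y(p) = A(1 - e^{-p}) - k$, $u(w) = w$,
and $v(p) = p^2/2$. These forms, the parameter values, and the function
names are hypothetical choices made for the example; the model itself
does not specify them.

```python
import math

# Illustrative primitives (assumptions for this sketch, not from the paper).
A, K = 10.0, 1.0              # y(p) = A*(1 - exp(-p)) - K, so y(0) = -K < 0


def y(p):
    """Marginal product per unit of time: increasing, concave, bounded above."""
    return A * (1.0 - math.exp(-p)) - K


def u(w):
    """Utility of income, linear here, normalized so that u(0) = 0."""
    return w


def v(p):
    """Disutility of performance: strictly convex with v(0) = v'(0) = 0."""
    return 0.5 * p * p


def p_star(theta):
    """Maximizer of u[y(p)] - v(p)/theta, found from the first-order condition."""
    # With these forms the condition is A*theta*exp(-p) = p. The difference
    # A*theta*exp(-p) - p is strictly decreasing in p, positive at p = 0 and
    # negative at p = A*theta, so bisection on [0, A*theta] finds the root.
    lo, hi = 0.0, A * theta
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if A * theta * math.exp(-mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def w_star(theta, c=0.0):
    """Wage in (3) for a lump-sum transfer c (c = 0 is an assumed default)."""
    return y(p_star(theta)) - c
```

Evaluating `p_star` and `w_star` on a grid of abilities shows both rising
with $\theta$ under these assumed forms, in line with the monotonicity stated
in the text.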


III. Equilibrium Hierarchy 

A firm learns about the abilities of its employees by observing their 
performance. If an employee stays long enough, this process reaches 

be defined as

$$u(w_t) = \frac{u^*(\gamma w_t)}{\gamma}, \qquad v(p_t) = v^*(p_t)\,\frac{e^{\rho\gamma} - 1}{\rho\gamma}.$$

For $\gamma$ and $\rho$ constant, therefore, assumption 2 covers this case. In the present paper, $\rho$ is
always constant. The only place in which $\gamma$ is not taken as constant is proposition 2, but
using the utility function of this note does not affect the results of that proposition (see
Sec. III).

3 In other models in the literature, adverse selection is a problem only when either
the performance of employees is not verifiable so that piece rate contracts are not
legally enforceable (e.g., Greenwald 1986) or output is randomly related to ability so
that employees wish to share the risk with the firm.
a limit when the firm has either learned the employee’s ability
precisely or narrowed down the range to an irreducible minimum. In the
spirit of a backward induction solution in dynamic programming, this
section considers the equilibrium structure of termination contracts
that must exist in this limit. The next section considers the process by
which employees reach their limiting contract. MacLeod and Malcomson
(in press, proposition 3) show that all self-enforcing contracts are
equivalent to termination contracts when firms earn zero profits in
equilibrium. For such equilibria, therefore, there is no loss of generality
in assuming that the equilibrium hierarchy consists only of contracts
with wage payments independent of the current period’s output.

Under a termination contract, the wage paid to an employee is
independent of performance, but the employer can terminate the
contract next period if performance is unsatisfactory. Thus a termination
contract is specified by a wage-performance pair $(w, p)$, with $w$
the wage and $p$ the performance level below which the contract is
terminated. As the model is stationary over time, the limiting
wage-performance pair for an employee is also stationary. It may or may
not be different for different $\theta$. Denote by $R$ the (possibly infinite)
number of limiting wage-performance pairs, called ranks, ordered so
that rank 1 has the lowest performance level. The contract for rank $r$
can then be denoted $(w^r, p^r)$, with $p^r > p^{r-1}$ for all $r = 1, \ldots, R$. For
notational convenience, let rank 0, with “performance level” $p^0 = 0$,
denote those unemployed.

Since, by assumption 2, the efficient performance level $p^*(\theta)$ and
wage $w^*(\theta)$ defined by (2) and (3) are increasing in $\theta$, it is natural to
expect (and we will verify later) that $w^r > w^{r-1}$ and that employees
with higher ability are assigned to higher ranks. Accordingly, define
$\theta^r$ so that employees with ability $\theta \in [\theta^r, \theta^{r+1})$ are assigned to rank $r$ in
the limit. For notational convenience, define $\theta^0 \equiv \theta^-$ and $[\theta^R, \theta^{R+1}) \equiv
[\theta^R, \theta^+]$. Then $H = \{w^r, p^r, \theta^r\}_{r=1}^{R}$ denotes a hierarchy of ranks with
$(w^r, p^r)$ the contract for employees with ability $\theta \in [\theta^r, \theta^{r+1})$. Since $R$ is
endogenous, there could, in principle, be a different rank for each $\theta$.

The limiting employment relationship is a repeated game. For a
hierarchy to be an equilibrium, abiding by their contracts must be
optimal for all parties at each date; that is, it must be a best-response
strategy for the subgame starting at $t$, given the beliefs of the parties
about the employees’ abilities, for all possible information sets. It also
makes sense to require that updating of beliefs follows Bayes’s rule
wherever possible, that is, for events that can actually occur in
equilibrium. Thus for $H = \{w^r, p^r, \theta^r\}_{r=1}^{R}$ to be an equilibrium, we require
that, for some reasonable belief structure about events that cannot
occur in equilibrium, the following strategies form a perfect Bayesian
Nash equilibrium to the repeated game:$^4$ (A) Firm’s strategy: Offer
$(w^r, p^r)$ in period $t$ to employees believed to have ability $\theta \in [\theta^r, \theta^{r+1})$,
for all $r = 1, \ldots, R$ and all $t$. (B) Strategy of employees with ability
$\theta \in [\theta^r, \theta^{r+1})$, $r = 1, \ldots, R$: Perform at $p^r$ in period $t$ if offered $(w^r, p^r)$,
otherwise quit, all $t$.

The firm’s strategy A is a best response to strategy B if and only if

$$w^r \le y(p^r), \quad r = 1, \ldots, R. \tag{4}$$

This is a necessary condition because, given strategy B, if the firm
believes that employees believed to have ability $\theta \in [\theta^r, \theta^{r+1})$ will
perform at $p^r$ if offered $(w^r, p^r)$, then it believes that they will generate
less revenue than their wage if $w^r > y(p^r)$. It is a sufficient condition
because if the firm believes that an employee has ability $\theta \in [\theta^r, \theta^{r+1})$,
then, in view of strategy B, it believes that the employee will perform
at $p^r$ if offered $(w^r, p^r)$, which, with (4) satisfied, yields nonnegative
profits, whereas if offered any other contract, he will quit, resulting in
zero profits.

Denote by $\gamma V^r(\theta)$ the remaining lifetime expected utility of an
employee of type $\theta$ in rank $r$ who performs below $p^r$. Given strategy A, it
clearly depends on firms’ beliefs about the ability of such a person, to
which we return later. Also define

$$U^r(\theta) \equiv \frac{u(w^r) - [v(p^r)/\theta]}{1 - \delta}, \quad r = 1, \ldots, R, \tag{5}$$

$$U^0(\theta) \equiv \frac{u^0(\theta)}{1 - \delta}, \tag{6}$$

where $\gamma U^r(\theta)$ is the remaining lifetime utility of an employee with
ability $\theta$ who remains permanently in rank $r$, and $\gamma U^0(\theta)$ that of
remaining permanently unemployed. A necessary condition for strategy
B to be a best response to strategy A is clearly

$$\gamma U^r(\theta) \ge \gamma u(w^r) + \delta \gamma V^r(\theta), \quad \text{for } \theta \in [\theta^r, \theta^{r+1}),\ r = 1, \ldots, R. \tag{7}$$


4 There are many different pairs of strategies that support the same equilibrium
hierarchy. Closer to the strategies used in the next section, and perhaps to those
observed in practice, are the following: Firm’s strategy: (1) At the beginning of period $t$,
offer the contract $(w^r, p^r)$ to any worker believed to have ability $\theta \in [\theta^r, \theta^{r+1})$ and not
previously fired. (2) At the end of period $t$, fire all employees in rank $r$ who performed
below $p^r$. Strategy of employee with ability $\theta \in [\theta^r, \theta^{r+1})$, $r = 1, \ldots, R$: (1) At the
beginning of period $t$, accept the contract giving the highest utility of all those on offer.
Denote this $(w^s, p^s)$. If currently employed in rank $s'$ and $(w^s, p^s)$ is offered by two or
more firms, including the current employer, accept the current employer’s offer if and
only if $s \ge s'$. (2) If $s \le r$, perform at $p^s$; if $s > r$, perform at zero. As will be clear from
proposition 1 below, these strategies give rise to the same equilibrium hierarchy as
strategies A and B. Strategies A and B are adopted in the text because they make the
exposition simpler.
To see this, note that the right-hand side is the remaining lifetime
expected utility, discounted to the end of the current period, of doing
no work (i.e., setting $p = 0$, which gives zero disutility of work),
collecting wage $w^r$ for the current period, but having the contract
terminated for the subsequent period.

Kreps and Wilson (1982) have argued that the structure of beliefs is 
an essential part of the specification of equilibrium in models of this 
sort, but beliefs about events that do not occur in equilibrium are 
problematic. They are crucial for supporting the equilibrium, but 
since they never actually occur, there seems no uniquely appealing 
general procedure to impose on the way rational agents form them. 
In these circumstances, a natural procedure for economic modeling is 
to adopt a belief structure that, for the particular problem, seems 
plausible from an economic point of view. 

To illustrate the importance of beliefs, suppose that firms believe
that employees in rank $r$ who perform below $p^r$ are simply shirking
and in fact have ability $\theta \in [\theta^r, \theta^{r+1})$. Then, as in Shapiro and Stiglitz
(1984), in which there is no adverse selection, poor performance does
not result in any revision of beliefs about ability. If, as under strategy
A, firms continue to offer the contract $(w^r, p^r)$ despite performance
below $p^r$, condition (7) would not be satisfied since then $V^r(\theta) \ge U^r(\theta)$.
In Shapiro and Stiglitz an equilibrium with positive output exists only
because shirking employees are fired and the existence of involuntary
unemployment ensures that they cannot guarantee to find another
job immediately.

When, as in the present model, firms do not know a priori the
ability of each employee, it seems more plausible from an economic
point of view for them to admit the possibility that a mistake was made
in assigning the employee to a rank. Since we are here concerned with
the limiting hierarchy in which firms have found out all they are
going to about the ability of employees, it would seem natural for
them to assume that any mistake was small; that is, the employee had
been placed one rank too high. Thus we adopt the following belief
structure: Firm’s beliefs: An employee in rank $r$ with performance
$p \ge p^r$ in the previous period has ability $\theta \in [\theta^r, \theta^{r+1})$, $r = 1, \ldots, R$,
with probability 1. An employee in rank $r$ with performance $p < p^r$ in
the previous period has ability $\theta \in [\theta^{r-1}, \theta^r)$ with probability 1.$^5$ These
beliefs satisfy Bayes’s rule for events that can happen in equilibrium
since in equilibrium employees with ability $\theta \in [\theta^r, \theta^{r+1})$ in rank $r$

5 A hierarchy similar in spirit, but different in detail, would arise if firms believed that
such employees had ability $\theta \in [\theta^{r-1}, \theta^r)$ with probability $\alpha < 1$ and ability $\theta \in [\theta^r, \theta^{r+1})$
with probability $1 - \alpha$, and assigned them to ranks $r - 1$ and $r$ with probabilities $\alpha$ and
$1 - \alpha$, respectively.
perform at $p^r$. Moreover, with these beliefs, strategy B is a best
response to strategy A. To see this, first note that, given the firm’s
beliefs, it is never a best response for an employee with ability $\theta \in [\theta^r,
\theta^{r+1})$ to choose performance $p > p^r$ or $0 < p < p^r$. Second, there is no
better response for an employee with $\theta \in [\theta^r, \theta^{r+1})$ than to quit if
offered a contract other than $(w^r, p^r)$ since a potential employer can
infer the rank offered from the wage offered and would always be
prepared to match the contract offered by the present employer.
Thus (7) is sufficient for strategy B to be a best response. Under the
firm’s beliefs, the default utility of an employee in rank $r$ who
performs below $p^r$ is

$$V^r(\theta) = U^{r-1}(\theta), \quad r = 1, \ldots, R,\ \theta \in [\theta^-, \theta^+], \tag{8}$$

since, given the beliefs, any employer would offer the contract $(w^{r-1},
p^{r-1})$ but none $(w^r, p^r)$. Those in rank 1 who perform below $p^1$ are
never rehired and receive utility $U^0(\theta)$. Thus a hierarchy of ranks
satisfying (4), (7), and (8) constitutes a perfect Bayesian Nash
equilibrium supported by strategies A and B and the firm’s beliefs.$^6$

To be a market equilibrium a hierarchy must also satisfy the
free-entry condition that no firm can profitably offer another firm’s
employees contracts that both are incentive compatible and give strictly
greater lifetime utility; hence the following definition.

Definition. $H = \{w^r, p^r, \theta^r\}_{r=1}^{R}$, with $p^{r+1} > p^r$, $r = 0, \ldots, R - 1$, is
an “equilibrium hierarchy” if it satisfies the incentive-compatibility
conditions (4), (7), and (8) and the free-entry condition that if there is
a rank $r = 0, \ldots, R$, a $\theta \in [\theta^r, \theta^{r+1})$, and a contract $(w, p)$ with both
$w \le y(p)$ and

$$\frac{u(w) - [v(p)/\theta]}{1 - \delta} \ge u(w) + \delta U^r(\theta), \tag{9}$$

then $\{u(w) - [v(p)/\theta]\}/(1 - \delta) \le U^r(\theta)$.

For what follows it is useful to define

$$p^0(\theta) \equiv \operatorname*{argmax}_{p} \left\{ \delta u[y(p)] - \frac{v(p)}{\theta} \right\}, \quad \theta \in [\theta^-, \theta^+]. \tag{10}$$

By assumptions 1 and 2, the maximand in (10) is twice differentiable
and strictly concave. Since it has positive slope as $p \to 0$ and a limit of
$-\infty$ as $p \to \infty$ for all $\theta \in [\theta^-, \theta^+]$, $p^0(\theta)$ is unique, positive, finite, and
differentiable for all $\theta \in [\theta^-, \theta^+]$.

6 For a formal proof that the equilibrium hierarchy of proposition 1 below forms a
perfect Bayesian Nash equilibrium, see the appendix of the original version of this
paper (MacLeod and Malcomson 1985).
Assumption 3. There exists $\theta' \in (\theta^-, \theta^+)$ such that


tW«ni} - —- «°(0'). 

aiie e [6 - , e + j. 


6*6 


86 


( 11 ) 

( 12 ) 


Condition (11) ensures that there is some $\theta \in [\theta^-, \theta^+]$ for which the
incentive-compatibility conditions can be satisfied so that some
employment can occur but that this is not true for the very lowest ability.
It ensures that there is some outcome worse than being in rank 1 so
that an incentive-compatible contract for that rank exists. Condition
(12) is a regularity condition on the default utility. It ensures that the
utility of being unemployed does not rise so fast with $\theta$ that
higher-ability employees remain unemployed when lower-ability employees
have jobs.

Proposition 1. Under assumptions 1-3, there exists a unique
equilibrium hierarchy. It has $R$ finite and satisfies

$$w^r = y(p^r), \quad \text{for } r = 1, \ldots, R, \tag{13}$$

$$U^r(\theta^r) = u(w^r) + \delta U^{r-1}(\theta^r), \quad \text{for } r = 1, \ldots, R, \tag{14}$$

$$p^r = \operatorname*{argmax}_{p} \left\{ \delta u[y(p)] - \frac{v(p)}{\theta^r} \right\}, \quad \text{for } r = 1, \ldots, R. \tag{15}$$

The formal proof of this result, being rather long, is in the Appendix,
but the intuition behind it is straightforward. Condition (13)
requires that the wage in each rank equals the marginal product in
that rank so that firms earn no profits from the limiting contracts.
Clearly, a firm would never retain an employee with a wage always
greater than the marginal product. Equally, if it made a strictly positive
profit from an employee, a competing firm, knowing that
employee’s rank, would offer a higher wage for the same performance
provided that the employee would not then shirk. But it is clear from
(5) and (7) that a higher wage never induces an employee to shirk, so
the wage will never be less than marginal product either. Condition
(14) requires that the ranks be constructed so that those employees
with the lowest ability in a rank are indifferent between, on the one
hand, performing just well enough to stay there and, on the other,
shirking for one period and, as a result, being rehired by another firm
in the rank below. Because the marginal disutility of performance
decreases with ability, $U^r(\theta) - \delta U^{r-1}(\theta)$ is increasing in $\theta$, so
higher-ability employees in that rank will not shirk either. Conditions (14)
and (15) jointly determine $p^r$ and $\theta^r$. Straightforward manipulation
with the use of (5) shows that (15) is equivalent to requiring that $p^r$
maximize $U^r(\theta^r) - u(w^r)$ when (13) holds. It follows that $\theta^r$ is as close
to $\theta^{r-1}$ as is consistent with (14) being satisfied for rank $r$, thus
minimizing the range of ability in each rank. This is as one might
expect. If performance were verifiable, the efficient outcome given in
(2) and (3) would have the wage and performance continuous in $\theta$, in
effect a continuum of ranks. Minimizing the range of ability in each
rank is what gets closest to that outcome. What limits how close the
ranks can be is that an employee in rank $r$ who shirks is believed to
have ability appropriate to rank $r - 1$, so to prevent shirking, the
utility in rank $r - 1$ must be lower than that in rank $r$ by the discrete
amount of the reduction in the disutility of effort that results from
shirking, discounted appropriately because the drop to rank $r - 1$
occurs in the period after shirking takes place. This difference in
utility is the cost of the loss of reputation that results from shirking.
That cost removes the need for the involuntary unemployment that is
required to prevent shirking in Shapiro and Stiglitz (1984). The
discrete utility difference between ranks ensures that there can be only a
finite number of ranks even though ability is continuous. The gain
from shirking, of course, depends on how long it is before the penalty
of falling to a lower rank is imposed, that is, on the length of the
hiring period $\gamma$. Proposition 2 shows how this affects the rank
structure.
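
The way conditions (13)-(15) pin down the ranks can be sketched
numerically. The fragment below is only an illustration under assumed
primitives: $y(p) = A(1 - e^{-p}) - k$, $u(w) = w$, $v(p) = p^2/2$, a constant
$u^0$, and the parameter values shown. None of these, and none of the
function names, come from the model itself. It finds each threshold
$\theta^r$ as the ability at which (14) holds with equality once (13) and
(15) are imposed, and stops when no further rank fits below $\theta^+$.

```python
import math

# Illustrative primitives (assumptions for this sketch, not from the paper).
A, K = 10.0, 1.0                 # y(p) = A*(1 - exp(-p)) - K, so y(0) = -K < 0
DELTA = 0.9                      # employees' discount factor
THETA_LO, THETA_HI = 0.01, 5.0   # assumed support of ability
U0 = 0.0                         # utility flow when unemployed, held constant


def y(p): return A * (1.0 - math.exp(-p)) - K
def u(w): return w
def v(p): return 0.5 * p * p


def bisect(f, lo, hi, iters=200):
    """Root of an increasing function f with f(lo) < 0 <= f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi


def p_circ(theta):
    """Maximizer of DELTA*u[y(p)] - v(p)/theta, as in (10) and (15)."""
    # First-order condition for these forms: DELTA*A*theta*exp(-p) = p.
    return bisect(lambda p: p - DELTA * A * theta * math.exp(-p),
                  0.0, DELTA * A * theta)


def flow(w, p, theta):
    """Per-period utility, equal to (1 - DELTA) * U(theta) for contract (w, p)."""
    return u(w) - v(p) / theta


def hierarchy(max_ranks=100):
    """Return a list of (theta_r, w_r, p_r), lowest rank first."""
    ranks = []
    prev_theta = THETA_LO
    prev_flow = lambda th: U0    # (1 - DELTA) * U^0(theta)
    while len(ranks) < max_ranks:
        def gap(th, prev_flow=prev_flow):
            # (1 - DELTA) * [U^r(th) - u(w^r) - DELTA * U^{r-1}(th)],
            # with w^r = y(p^r) from (13) and p^r = p_circ(th) from (15).
            p = p_circ(th)
            return DELTA * u(y(p)) - v(p) / th - DELTA * prev_flow(th)
        if gap(THETA_HI) < 0:    # (14) cannot hold for any remaining ability
            break
        theta_r = bisect(gap, prev_theta, THETA_HI)
        p_r = p_circ(theta_r)
        w_r = y(p_r)             # zero profit in the limiting contract
        ranks.append((theta_r, w_r, p_r))
        prev_theta = theta_r
        prev_flow = lambda th, w=w_r, p=p_r: flow(w, p, th)
    return ranks


if __name__ == "__main__":
    for r, (theta_r, w_r, p_r) in enumerate(hierarchy(), start=1):
        print(r, round(theta_r, 4), round(w_r, 4), round(p_r, 4))
```

Under these assumed forms the gap function is increasing in $\theta$ above the
previous threshold and strictly negative at that threshold, which is the
numerical counterpart of the discrete utility step between adjacent
ranks described above.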

Proposition 2. Under assumptions 1-3, for the equilibrium
hierarchy

$$\lim_{\gamma \to 0} p^r = p^{r-1}, \tag{16}$$

$$\lim_{\gamma \to 0} \theta^r = \theta^{r-1}, \tag{17}$$

$$\lim_{\gamma \to 0} p^0(\theta) = p^*(\theta), \quad \text{all } \theta \in [\theta^-, \theta^+]. \tag{18}$$

Proof. As $\gamma \to 0$, the time preference factor $\delta \to 1$ necessarily.
Equation (16) then follows from (14) and (5), (17) from (15) and the
continuity of $p^r$ in $\theta^r$, and (18) from the definitions of $p^0(\theta)$ in (10) and
$p^*(\theta)$ in (2).$^7$ Q.E.D.

As $\gamma$ gets shorter, the drop in utility required to prevent
unsatisfactory performance gets smaller, the ranks in the hierarchy thus
become closer together, and more low-ability employees can be hired.
The performance for each $\theta$ approaches $p^*(\theta)$, the level that would be
established if performance were verifiable information.
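
The limit in (18) can be checked numerically in a hedged way. The sketch
below assumes $y(p) = A(1 - e^{-p}) - k$, $u(w) = w$, $v(p) = p^2/2$, and a rate of
time preference $\rho = 0.05$; these forms, values, and function names are
illustrative assumptions only. It compares the maximizer in (10), which
depends on $\gamma$ through $\delta = \exp(-\gamma\rho)$, with the efficient level in (2).

```python
import math

# Illustrative primitives (assumptions for this sketch, not from the paper).
A = 10.0       # scale of y(p) = A*(1 - exp(-p)) - K; K drops out of the FOC
RHO = 0.05     # assumed constant rate of time preference


def solve_foc(scale):
    """Root of scale*exp(-p) = p on [0, scale], found by bisection."""
    lo, hi = 0.0, scale
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if scale * math.exp(-mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def p_star(theta):
    """Efficient performance: maximizes u[y(p)] - v(p)/theta."""
    return solve_foc(A * theta)


def p_circ(theta, gamma):
    """Maximizer of delta*u[y(p)] - v(p)/theta with delta = exp(-gamma*RHO)."""
    delta = math.exp(-gamma * RHO)
    return solve_foc(delta * A * theta)


if __name__ == "__main__":
    for gamma in (1.0, 0.1, 0.01):
        print(gamma, p_star(1.0) - p_circ(1.0, gamma))
```

With these assumed forms the printed gap is positive and shrinks as the
hiring period $\gamma$ is shortened, which is the pattern proposition 2
describes.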

7 For the form of utility function in n. 2, note that taking limits of $u(w)$ and $v(p)$ as
$\gamma \to 0$ and inserting these into (14) and (15) leaves the substance of the proof unaltered.
IV. Promotion

The analysis of the preceding section refers only to the equilibrium
hierarchy, that is, to those employees who have been with the firm
long enough to reach the limiting rank appropriate to their ability. As
has already been shown, if performance were verifiable information,
new entrants to the labor market could be paid under a piece rate
contract and left to choose their own performance level. With the
equilibrium hierarchy, however, they cannot be hired directly into the
appropriate rank because of the adverse selection problem. The firm
cannot place them in the appropriate rank immediately since it does
not know their ability. Moreover, employees will not select the
appropriate rank themselves since, under termination contracts, they would
all choose rank $R$, which has the highest wage. Those with ability
$\theta < \theta^R$ would then shirk. The firm would thus make a loss for all $\theta < \theta^R$
and, since $w^R = y(p^R)$, zero profit for $\theta \ge \theta^R$, resulting in a loss overall.

Since any employee assigned to too high a rank will always shirk
and thus impose a loss on the firm, the firm can avoid losses
regardless of the distribution of $\theta$ only if it uses a selection mechanism that
ensures that no employee selects too high a rank. The same problem
does not, however, arise if an employee selects too low a rank initially
and then subsequently seeks promotion to the appropriate rank. We
therefore seek a selection scheme that is self-selecting in the following
sense.

Definition. A selection scheme is self-selecting from rank $r$,
$r = 0, \ldots, R - 1$, if, when it permits promotion from rank $r$ to rank $s > r$, an
employee in rank $r$ with $\theta \in [\theta^t, \theta^{t+1})$, $t \ge r$, selects rank $s$ only if $t \ge s$
and eventually reaches rank $t$.

It is clear that a scheme will be self-selecting only if it imposes on
employees who choose too high a rank a cost that outweighs the
higher wage. There are two ways to impose such costs in the present
model. One is for employees to pay a fee or bond to be placed in a
given rank. We call this selection by bonding. The other is for employees
to be required to work hard in a lower rank for promotion to a
higher one. We call this selection by performance. We discuss each in
turn.

For any scheme to be self-selecting from rank $r - 1$, it must be
possible for employees with ability $\theta \in [\theta^r, \theta^{r+1})$ to move to rank $r$ since
otherwise they could not get there without shirking. Let $b^r$ denote the
bond required for this. Then necessary conditions for a scheme to be
self-selecting by bonding from rank $r - 1$ are

$$U^{r-1}(\theta) \le u(w^r - b^r) - \frac{v(p^r)}{\theta} + \delta U^r(\theta), \quad \text{for } \theta \in [\theta^r, \theta^{r+1}), \tag{19}$$

$$U^{r-1}(\theta) \ge u(w^r - b^r) - \frac{v(p^r)}{\theta} + \delta U^r(\theta), \quad \text{for } \theta \in [\theta^{r-1}, \theta^r). \tag{20}$$




Conditions (19) and (20) ensure that those with $\theta \in [\theta^r, \theta^{r+1})$ get
higher utility and those with $\theta \in [\theta^{r-1}, \theta^r)$ lower utility, by paying the
bond, moving to rank $r$, and performing well enough to stay there
than they do by staying in rank $r - 1$. However, the scheme must also
be proof against the latter’s paying the bond, moving to rank $r$,
shirking, and going back to rank $r - 1$. (The former will not do this by the
construction of the equilibrium hierarchy.) This requires

$$U^{r-1}(\theta) \ge u(w^r - b^r) + \delta U^{r-1}(\theta), \quad \text{for } \theta \in [\theta^{r-1}, \theta^r). \tag{21}$$

With $r - 1 = 0$, these conditions apply also to those without a job.

Proposition 3. Under assumptions 1-3, there exists no selection
scheme that is self-selecting by bonding from rank $r$ for any
$r = 1, \ldots, R - 1$, nor for $r = 0$ if $u^0(\theta)$ is strictly increasing in $\theta$.

Proof. Since $U^r(\theta)$ is continuous in $\theta$, (19) and (20) require

$$U^{r-1}(\theta^r) = u(w^r - b^r) - \frac{v(p^r)}{\theta^r} + \delta U^r(\theta^r). \tag{22}$$

Since $u(w)$ is monotonic, $b^r$ is uniquely determined. Since from (5) or
(6)

$$U^r(\theta) = u(w^r) - \frac{v(p^r)}{\theta} + \delta U^r(\theta), \quad \text{all } \theta \in [\theta^-, \theta^+], \tag{23}$$

(22) can be written as

$$U^{r-1}(\theta^r) = u(w^r - b^r) + U^r(\theta^r) - u(w^r),$$

and, with the use of (14), becomes

$$U^{r-1}(\theta^r) = u(w^r - b^r) + \delta U^{r-1}(\theta^r). \tag{24}$$

But it follows from (5) that, for $r \ge 2$, $U^{r-1}(\theta)$ is increasing in $\theta$, so
since $\delta < 1$, (24) implies that (21) is violated. Hence, no self-selecting
scheme exists from any rank $r \ge 1$. Moreover, by (6), $u^0(\theta)$ increasing
in $\theta$ implies $U^0(\theta)$ increasing in $\theta$, so then (24) implies that no
self-selecting scheme exists from rank 0. Q.E.D.

Because higher $\theta$ involves a lower disutility of performing any given
task, one would expect $u^0(\theta)$ to be increasing in $\theta$ if the alternative to
employment is self-employment or working in the home. Then
self-selection by bonding is impossible. The economics of this is
straightforward. To induce employees with $\theta \ge \theta^r$, but not those with $\theta < \theta^r$,
to self-select into rank $r$, the bond for promotion to rank $r$ must be
such that those with $\theta = \theta^r$ are just indifferent between, on the one
hand, paying to be promoted and then performing at the level
necessary to stay in rank $r$ and, on the other hand, staying in rank $r - 1$.
That is the content of (22). But in the equilibrium hierarchy, employees
with ability $\theta^r$ are indifferent between performing well enough to
stay in rank $r$ and shirking, getting fired, and going to rank $r - 1$ in
another firm. Thus they are also indifferent between staying in rank
$r - 1$ and paying for promotion to rank $r$, then shirking. That is the
content of (24). Because of their higher disutility of performance,
those with $\theta < \theta^r$ get lower one-period utility than $\theta^r$ do from
performing at the level to stay in rank $r - 1$. But because shirking
involves no disutility of performance, they get the same one-period
utility as $\theta^r$ from paying for promotion and then shirking, so their
utility gain from doing that is greater than it is for $\theta^r$. Hence, a bond
that will induce $\theta^r$ to select rank $r$ will also induce $\theta < \theta^r$ to do so.
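
The argument can be illustrated with a small numerical check. The sketch
below assumes $u(w) = w$, $v(p) = p^2/2$, $\delta = 0.9$, and an arbitrary contract
for rank $r - 1$ with positive performance; all of these, together with the
function names, are hypothetical choices for the example. It fixes the
bond through (24) and then tests whether (21) holds for a given ability.

```python
# Illustrative primitives (assumptions for this sketch, not from the paper).
DELTA = 0.9


def u(w):
    """Utility of income, linear here."""
    return w


def v(p):
    """Disutility of performance, strictly convex."""
    return 0.5 * p * p


def U(w, p, theta):
    """Lifetime utility (5) of staying permanently in a rank with contract (w, p)."""
    return (u(w) - v(p) / theta) / (1.0 - DELTA)


def bond_condition_21_holds(w_prev, p_prev, theta_r, theta):
    """Check (21) at ability theta when the bond is pinned down by (24).

    (w_prev, p_prev) is the rank r-1 contract and theta_r the lowest
    ability assigned to rank r.
    """
    # (24) rearranged: u(w^r - b^r) = (1 - DELTA) * U^{r-1}(theta^r).
    u_after_bond = (1.0 - DELTA) * U(w_prev, p_prev, theta_r)
    # (21): U^{r-1}(theta) >= u(w^r - b^r) + DELTA * U^{r-1}(theta).
    stay = U(w_prev, p_prev, theta)
    return stay >= u_after_bond + DELTA * stay


if __name__ == "__main__":
    # Abilities below theta_r = 1.0 fail (21); an ability above it passes.
    for theta in (0.5, 0.8, 1.5):
        print(theta, bond_condition_21_holds(2.0, 1.0, 1.0, theta))
```

For any rank $r - 1$ contract with $p^{r-1} > 0$, the check fails for every
$\theta < \theta^r$ under these assumptions, which mirrors the role of $U^{r-1}(\theta)$
being increasing in $\theta$ in the proof.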

An alternative to bonding is for a firm to assign employees to ranks
on the basis of performance. For this, new entrants to the labor market
must initially be hired into what will be called rank $i$ with wage $w^i$
and offered promotion on the basis of performance in that rank. But
since only the firm and the employee observe performance, promises
to promote must be made incentive compatible. To see the problem
this poses, note that a firm must always pay $w^r$ to employees promoted
to rank $r$ since otherwise another firm, knowing them to have ability
$\theta \ge \theta^r$, would attract them away with higher offers. But with $w^r = y(p^r)$,
a firm makes no profits from employees in rank $r$ who perform at $p^r$,
which, under strategy B, is what those with ability $\theta \in [\theta^r, \theta^{r+1})$ do. For
self-selection, the performance criteria for promotion above rank $r$
must clearly be greater than $p^r$ since otherwise employees who should
stay in rank $r$ of the equilibrium hierarchy, and perform at $p^r$ to do so,
will be promoted. Thus the firm earns profits from an employee
working hard enough for promotion but none once that employee
has reached the appropriate rank of the equilibrium hierarchy.
Therefore, even if an employee has worked hard enough to meet the
promotion criteria for some higher rank, ex post the firm has an
incentive to refuse promotion and tell the employee to try again the
following period. If the employee actually did that, the firm would
increase its profits by this strategy.

We shall, however, show that there exists a scheme for selection by
performance that involves promotion of employees one rank at a time,
is incentive compatible for the firm, and is self-selecting. We call
this a promotion scheme. It turns out that the initial rank $i$ has the
same performance criterion for moving to rank 1 as for an employee in
rank 1 to stay there, namely $p^1$, although the wage may be different.
Thus in performance terms, rank $i$ is equivalent to rank 1. Let
$p'^{r}$ be the performance required for promotion from rank $r$ to
rank $r + 1$. As explained above, self-selection requires
$p'^{r} > p^r$. Then, for a promotion scheme to be both incentive
compatible for the firm and self-selecting requires the following
strategies to be best responses: (C) Firm's strategy for an employee
with performance $p$: If in rank $i$, fire if $p < p^i$, move to rank 1
if $p^i \le p < p'^{i}$, and promote to rank 2 if $p \ge p'^{i}$; if in
rank $r$, $r = 1, \ldots, R - 1$, promote to rank $r + 1$ if
$p \ge p'^{r}$; otherwise, follow strategy A. (D) Strategy of employee
with ability $\theta$: If in rank $i$, set performance $p = p'^{i}$ if
$\theta \ge \theta^2$ and $p = p^i$ if $\theta^1 \le \theta < \theta^2$;
otherwise, $p = 0$; if in rank $r$, $r = 1, \ldots, R$, set performance
$p = p'^{r}$ if $\theta \ge \theta^{r+1}$ and if firm promotes stay, if
not quit; otherwise, follow strategy B.

Clearly, strategy C is a best response to strategy D. Not promoting
an employee in rank $r$ who performs at $p'^{r}$ causes that employee to
quit, and thus zero future profits are earned from him. Since such an
employee will perform at least at $p^{r+1}$ in rank $r + 1$ and may
perform above $p^{r+1}$ for further promotion, the firm's future profits
are certainly not decreased, and may be increased, by promoting the
employee. For strategy D to be a best response, employees in rank $r$
with ability $\theta \in [\theta^s, \theta^{s+1})$, $s > r$, must get
greater utility from working at the performance levels required for
promotion to each successive rank up to $s$ and then staying there,
whereas those with ability $\theta \in [\theta^r, \theta^{r+1})$ must
prefer to stay in rank $r$. With the definition

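$$U^i(\theta) \equiv u(w^i) - \frac{v(p^i)}{\theta} + \delta U^1(\theta), \qquad \theta \in [\theta^-, \theta^+], \tag{25}$$

and the conventions $i + n \equiv 1 + n$ for $n > 0$,
$s - (s - i) \equiv 1$, and $\theta^i \equiv \theta^1$, this can be
written as

$$U^r(\theta) \le \sum_{n=r}^{s-1} \delta^{n-r}\left[u(w^n) - \frac{v(p'^{n})}{\theta}\right] + \delta^{s-r} U^s(\theta), \tag{26}$$

for $\theta \in [\theta^s, \theta^{s+1})$, $s = r + 1, \ldots, R$;
$r = i, 1, \ldots, R - 1$;

$$U^r(\theta) > \sum_{n=r}^{s-1} \delta^{n-r}\left[u(w^n) - \frac{v(p'^{n})}{\theta}\right] + \delta^{s-r} U^s(\theta), \tag{27}$$

for $\theta \in [\theta^r, \theta^{r+1})$, $s = r + 1, \ldots, R$;
$r = i, 1, \ldots, R - 1$.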


In addition, the scheme must be proof against employees in rank $r$
with ability $\theta \in [\theta^r, \theta^{r+1})$ working for promotion
to rank $s > r$, then shirking in successive ranks until they get down
to rank $s - m$ for any $m$. This requirement can be written as

$$U^r(\theta) \ge \sum_{n=r}^{s-1} \delta^{n-r}\left[u(w^n) - \frac{v(p'^{n})}{\theta}\right] + \delta^{s-r}\left[\sum_{n=0}^{m-1} \delta^{n} u(w^{s-n}) + \delta^{m} U^{s-m}(\theta)\right], \tag{28}$$

for $\theta \in [\theta^r, \theta^{r+1})$, $m = 0, \ldots, s - r$;
$s = r + 1, \ldots, R$; $r = i, 1, \ldots, R - 1$.


Definition. A promotion scheme associated with the equilibrium
hierarchy $\{w^r, p^r, \theta^r\}_{r=1}^{R}$ consists of all employees
being hired initially into rank $i$, and a set of
$\{p'^{r}\}_{r=1}^{R-1}$, a $p^i$, and a $p'^{i}$ that satisfy (26)-(28).

Proposition 4. Under assumptions 1-3, there exists a unique
promotion scheme associated with the equilibrium hierarchy. It has
$p^i = p^1$, $p'^{i} = p'^{1}$, and $p'^{r} > p^{r+1}$,
$r = 1, \ldots, R - 1$, and satisfies

$$u(w^r) - \frac{v(p'^{r})}{\theta^{r+1}} + \delta U^{r+1}(\theta^{r+1}) = U^r(\theta^{r+1}), \qquad r = 1, \ldots, R - 1. \tag{29}$$

The proof of this result is given in the Appendix. It is, however,
intuitively clear why self-selection by performance works although
self-selection by bonding does not. Like (22), condition (29) requires
that employees with ability $\theta = \theta^{r+1}$ be indifferent between
staying in rank $r$ and promotion to $r + 1$. But because the disutility
of effort decreases with ability, performing well enough for promotion
is relatively less costly in utility terms for those with higher
ability, whereas paying for promotion is not. This is sufficient to
induce employees with $\theta < \theta^{r+1}$ not to work for promotion.

Since $p'^{r} > p^r$ and the wage of those in rank $r \ge 1$ is
$y(p^r)$, firms make positive profits from employees who are working to
get promoted. This has two important consequences. First, because a
firm makes no profits from employees with ability
$\theta \in [\theta^r, \theta^{r+1})$ in rank $r$, it would increase its
profits at their expense if it were to demote them and get them to
work for promotion again. Quitting rather than accept demotion is the
optimal strategy for employees to adopt in order to prevent this. It
denies the firm any profits from demotion and, since other firms will
offer as much as is obtained after demotion, is costless to employees.
But then that strategy is also optimal for employees demoted for
shirking because it is in their interest to mimic the response of
nonshirking employees. This provides a theoretical rationale for the
labor market convention that employees do not accept demotion.

Second, competition cannot induce firms to pay back the profits
made during promotion in the form of higher wages to any employees
other than those newly hired in rank $i$. The wages for all other
ranks are completely constrained by the incentive-compatibility
conditions. The wage $w^i$ is not: it cancels out of (26)-(28). If the
firm sets
$w^i > \min_{\theta \in [\theta^-, \theta^1)} u^0(\theta)$, however, it will attract some employees with
ability $\theta \in [\theta^-, \theta^1)$ who will shirk and on whom it thus makes a loss in
the first period before they are weeded out. An alternative strategy
the firm can adopt to compete for more able employees is to commit
itself to using its profits to provide benefits for employees who stay
more than one period, but in such a way that incentive compatibility
of the promotion scheme and the equilibrium hierarchy are not upset.
To do this the total expenditure by the firm must be independent of
how many employees it retains and what rank they reach. If appropriate
ways of making the necessary commitments can be found, competition
will induce the firm to tailor the benefits for employees promoted
only as far as rank $r$ to the profits they themselves generate.

How can a firm commit itself in this way? One possibility is to set up 
a benevolent fund for employees to which the firm makes payments 
independent of the number of employees retained and what rank 
they achieve. Others are expenditures on a luxurious headquarters 
building or a country club that benefit particularly employees who 
reach the higher ranks in the hierarchy. Whatever the form, it is 
crucial that the expenditure incurred by the firm be independent of 
how many employees reach each rank since otherwise it would affect 
the firm’s promotion decisions. This suggests a reason for firms to 
reward employees, particularly senior ones, with such benefits in 
kind. Williamson (1963) attributes such expenditures to control by 
managers for whom profit maximization is not the only goal. The 
analysis here suggests that there may be good reasons, quite apart 
from differences in the tax rate on income and on benefits in kind, for 
profit-maximizing firms also to provide these kinds of benefits. 

The promotion structure derived here has many of the characteristics
of the promotion structures actually observed. Promotion step by
step of employees who perform above average in lower ranks seems 
to be the norm in practice. In the model, employees rising through 
the hierarchy are paid a wage less than their marginal product, 
whereas when they reach their limiting rank, their wage equals their 
marginal product. Moreover, a very low wage may be paid in the first 
period to mitigate the effects of adverse selection. Hence, for most 
employees, the ratio of their wage to their marginal product rises 
through their lifetime. In addition, since some of the employees 
newly promoted to a rank are working for further promotion, their 
performance is, on average, better than that of those who have been 
in it a long time. Both these are characteristics observed by Medoff 
and Abraham (1980). Finally, since all newly hired employees start at 
the same wage but some get promoted, the variance of earnings will 
increase with age, as noted by Mincer (1974). 

V. Concluding Remarks 

Termination contracts are two-party contracts between the employer
and each individual employee. There is, however, a growing literature
arguing that contracts introducing more parties can overcome the type
of moral hazard analyzed here. Malcomson (1984, 1986) has argued this
for rank-order tournaments, and Holmstrom (1982) for contracts that
introduce a passive third party to break the requirement that the
shares of total revenue accruing to the active parties sum to one. In
view of this, why should employment ever be based on two-party
contracts?⁸

One important reason is that termination contracts can be made 
entirely self-enforcing, with no need for legal enforceability. In the 
present model, the only legally enforceable element is the wage to be 
paid at the end of the period. Since, however, the firm must pay this 
wage regardless of performance, the same outcome would be 
achieved if it were paid when the employee started work. There 
would then be no need for a written contract or for reliance on the 
legal system for enforceability. 

A second reason for using two-party contracts is that contracts
involving more parties give scope for collusion and corruption in ways
that two-party contracts do not. In the spirit of Holmstrom (1982),
introducing a third party to whom employees must make a payment in the
event of a separation can overcome the moral hazard problem analyzed
here. But as Eswaran and Kotwal (1984) have shown for other contracts
of this type, there then always exists a Nash equilibrium agreement by
which the third party bribes the firm to induce a separation (the firm
gaining the bribe and the third party the separation payment minus the
bribe), thus destroying the incentive compatibility of the contract.
Indeed, MacLeod (1987) has shown that implementing efficient outcomes
in dominant strategies in such models necessarily requires individual
monitoring of workers. Rank-order tournaments overcome the moral
hazard problem by making the total wage bill independent of what the
firm reports each employee's performance to be so that it has no
incentive to misreport. But there still remains the possibility that
an employee will bribe the firm to report that he or she performed
best or, indeed, that the firm will demand a bribe not to report an
employee as the worst. Tournaments are also open to favoritism. Stoft
(1985) has elaborated on these points. We conjecture that such
problems could be overcome if corruption eventually became apparent so
that the firm could make a long-term gain from honesty, but it is not
clear whether the efficiency loss from that would be less than that
from the two-party contracts used here.

Other markets in which both moral hazard and adverse selection
pose serious problems are credit markets and insurance markets. In
these, such institutional features as credit rating agencies and
“no-claims” discounts for automobile insurance provide means by which
unsatisfactory performance results in costly loss of reputation in a
way similar to the employment hierarchy of the present paper. The
existence of such institutional features suggests that the basic ideas
expressed here may also apply to these markets.

8 One might also ask what the effect would be of including capital
markets in the model. Since most employees have a lower income at the
start of their career than later on, for $u(w)$ strictly concave they
would wish to borrow on the capital market to smooth consumption. This
is not the place for a full treatment of this issue, but there are
clearly difficulties with such borrowing. Any contract contingent on
future rank would be subject to severe moral hazard, and because of
adverse selection, even noncontingent contracts would have the problem
that lenders would not know any particular individual's capacity to
repay.

Appendix 

Proof of Proposition 1

The proof is by induction, first showing that rank 1 exists, then that
rank $r$ is uniquely determined given rank $r - 1$, and so on up to
rank $R$, $R$ being determined endogenously. Define
$x(\theta, \theta') \equiv u\{y[p^0(\theta)]\} - \{v[p^0(\theta)]/\theta'\delta\}$.
From the definition of $p^0(\theta)$ in (10) and assumption 2, the
partial derivative $x_1(\theta, \theta) = 0$, for
$\theta \in [\theta^-, \theta^+]$, and $x_2(\theta, \theta') > 0$, for
$\theta, \theta' \in [\theta^-, \theta^+]$. By assumption 3, there exists
a unique $\theta^1$ satisfying $x(\theta^1, \theta^1) = u^0(\theta^1)$. We
show that rank 1 is uniquely defined by
$\{w^1, p^1, \theta^1\} = \{y[p^0(\theta^1)], p^0(\theta^1), \theta^1\}$.
Clearly, $w^1 = y[p^0(\theta^1)]$ satisfies (4) and implies (13) for
$r = 1$. This, with (5) and the firm's belief structure, allows (7) for
$r = 1$ to be written as

$$u[y(p^1)] - \frac{v(p^1)}{\theta\delta} - u^0(\theta) \ge 0, \qquad \text{all } \theta \in [\theta^1, \theta^2). \tag{A1}$$

Rank 1 as specified satisfies this, and thus (7), with equality for
$\theta = \theta^1$, implying (14). Moreover, by assumption 3, (A1) is
satisfied for all $\theta \in [\theta^1, \theta^2)$, implying that (7) is
satisfied. Rank 1 as specified also satisfies the free-entry condition
for $r = 0$ because, with $p^1$ chosen as in (15), there can exist no
contract $(w, p)$ satisfying (9) for $\theta \in [\theta^-, \theta^1)$, so
no firm can profitably offer a contract between those for rank 0 and
rank 1 that gives greater utility than $U^0(\theta)$ for
$\theta \in [\theta^-, \theta^1)$. To show that rank 1 is unique, suppose
that $(w^{1'}, p^{1'}, \theta^{1'})$ could also be rank 1. From the
discussion above it is clear that $\theta^{1'} > \theta^1$. But then the
free-entry condition is not satisfied for this alternative rank since
$\theta^1 \in [\theta^-, \theta^{1'})$ and an employee with this ability
would be made strictly better off with the contract $(w^1, p^1)$ than in
rank 0. Thus we have shown that rank 1 exists, is uniquely defined,
and satisfies (13)-(15).

Given a unique rank $r$ satisfying (13)-(15), we now show either that
rank $r + 1$ is uniquely defined and also satisfies (13)-(15) or that
$R = r$. By assumption 2 and the definition of $p^0(\theta)$ in (10), we
have

$$\frac{\partial\{x(\theta, \theta) + [v(p^r)/\theta]\}}{\partial\theta} > 0 \tag{A2}$$

and

$$x(\theta^r, \theta^r) + \frac{v(p^r)}{\theta^r} = u[y(p^r)] + \left(1 - \frac{1}{\delta}\right)\frac{v(p^r)}{\theta^r} < u[y(p^r)]. \tag{A3}$$

Suppose that there exists a $\theta^{r+1} \in [\theta^r, \theta^+]$
satisfying

$$x(\theta^{r+1}, \theta^{r+1}) + \frac{v(p^r)}{\theta^{r+1}} = u[y(p^r)]. \tag{A4}$$

Then, from (A2) and (A3), it must be unique and satisfy
$\theta^{r+1} > \theta^r$. We now show that rank $r + 1$ is uniquely
defined by $\{y[p^0(\theta^{r+1})], p^0(\theta^{r+1}), \theta^{r+1}\}$.
Clearly, $w^{r+1} = y[p^0(\theta^{r+1})]$ satisfies (4) and implies (13).
From (A4), it follows that (7), with (8) inserted, holds with equality
when $\theta = \theta^{r+1}$, implying (14). Given the definition of
$p^r$, there exists no contract $(w, p)$ satisfying the free-entry
conditions $w \le y(p)$ and (9) for $\theta \in [\theta^r, \theta^{r+1})$.
Hence, no equilibrium rank can be formed between
$(w^r, p^r, \theta^r)$ and $(w^{r+1}, p^{r+1}, \theta^{r+1})$. To
demonstrate uniqueness of rank $r + 1$, one need only follow the
argument for rank 1.

Thus if a solution to (A4) exists with
$\theta \in [\theta^r, \theta^+)$, rank $r + 1$ is uniquely defined and
satisfies (13)-(15). If no solution exists, it follows from (A2) that
the left-hand side of (A4) is strictly less than the right-hand side
for $\theta \in [\theta^r, \theta^+]$ and, hence, there cannot exist a
contract $(w, p)$ satisfying $w \le y(p)$ and (9) of the free-entry
conditions. Then $R = r$. Finally, assumption 2 and (A3) will ensure the
existence of an $\varepsilon > 0$ such that
$\theta^{r+1} \ge \theta^r + \varepsilon$ for all $r$ defined in the
induction process. This implies that $R$ exists and is finite. Q.E.D.


Proof of Proposition 4 

With $r = 1, \ldots, R - 1$ and $s = r + 1$, (26) and (27) imply (29), as
required.

With the use of (23), (29) can be written as

$$u(w^r) - \frac{v(p'^{r})}{\theta^{r+1}} + \delta U^{r+1}(\theta^{r+1}) = u(w^r) - \frac{v(p^r)}{\theta^{r+1}} + \delta U^r(\theta^{r+1}), \qquad r = 1, \ldots, R - 1,$$

which, with the use of (14) to substitute for $U^r(\theta^{r+1})$, can be
written as

$$-\frac{v(p'^{r})}{\theta^{r+1}} + \delta U^{r+1}(\theta^{r+1}) = -\frac{v(p^r)}{\theta^{r+1}} + U^{r+1}(\theta^{r+1}) - u(w^{r+1}),$$

or, with the use of (23) and rearrangement,

$$v(p'^{r}) - v(p^{r+1}) = v(p^r), \qquad r = 1, \ldots, R - 1. \tag{A5}$$

Since $p^{r+1} > p^r > 0$, $r = 1, \ldots, R - 1$, and $v'(p) > 0$ for
$p > 0$ by assumption 2, it follows that $p'^{r}$ is uniquely determined
with $p'^{r} > p^{r+1}$ for $r = 1, \ldots, R - 1$, as stated in the
proposition. By a similar argument, (26) and (27) for $r = i$ ensure

$$u(w^i) - \frac{v(p'^{i})}{\theta^2} + \delta U^2(\theta^2) = u(w^i) - \frac{v(p^i)}{\theta^2} + \delta U^1(\theta^2), \tag{A6}$$

$$u(w^i) - \frac{v(p^i)}{\theta^1} + \delta U^1(\theta^1) = u(w^i) + \delta U^0(\theta^1). \tag{A7}$$

In view of (23) and (14) for $r = 1$, (A6) implies $p^i = p^1$, and (29)
for $r = 1$ and (A7), together with (14), imply $p'^{i} = p'^{1}$, as
required.

Condition (29) ensures that (26) with $s = r + 1$ holds with equality
for $\theta = \theta^{r+1}$. We now show that it holds for all
$\theta \ge \theta^{r+1}$ and that (27) with $s = r + 1$ holds for all
$\theta < \theta^{r+1}$. Note that, given (23), the inequality in (26)
with $s = r + 1$ can be written as

$$\delta[U^{r+1}(\theta) - U^r(\theta)] - \frac{v(p'^{r}) - v(p^r)}{\theta} \ge 0, \qquad r = i, 1, \ldots, R - 1. \tag{A8}$$

But from the definition (5), or (25) for $r = i$, and the fact that
$p^{r+1} > p^r$ and $p'^{r} > p^r$, it follows that the left-hand side of
this is increasing in $\theta$, so (A8) is satisfied for all
$\theta \ge \theta^{r+1}$ and violated for all $\theta < \theta^{r+1}$.
Sequential application of this establishes that (26) and (27) are
satisfied for all $s = r + 1, \ldots, R$.

It remains to show that (28) is satisfied. Use of (A5) to substitute
for $U^{r+1}(\theta^{r+1})$ in (29) gives

$$u(w^r) - \frac{v(p'^{r})}{\theta^{r+1}} + \delta u(w^{r+1}) + \delta^2 U^r(\theta^{r+1}) = U^r(\theta^{r+1}), \qquad r = 1, \ldots, R - 1, \tag{A9}$$

so that the weak inequality in (28) holds with equality for
$\theta = \theta^{r+1}$, $s = r + 1$, $r = i, 1, \ldots, R - 1$. But if
(23) is used to substitute for $U^r(\theta)$ on the left-hand side of
(28) and then to substitute for $\delta U^r(\theta)$, (28) with
$s = r + 1$ can be written as

$$\frac{1}{\theta}\left[v(p'^{r}) - (1 + \delta)v(p^r)\right] \ge \delta\left[u(w^{r+1}) - u(w^r)\right], \qquad \theta \in [\theta^r, \theta^{r+1}),\ r = i, 1, \ldots, R - 1. \tag{A10}$$

We know from (A9) that this holds with equality for
$\theta = \theta^{r+1}$, and since the right-hand side is strictly
positive, it follows that $[v(p'^{r}) - (1 + \delta)v(p^r)]$ is also
strictly positive. Hence, the left-hand side is strictly decreasing in
$\theta$, and thus the inequality in (A10) holds for all
$\theta < \theta^{r+1}$, $r = 1, \ldots, R - 1$. Since $p^i = p^1$ and
$p'^{i} = p'^{1}$, this also holds for $r = i$. Sequential application of
this result to successive ranks establishes that (28) holds for all
$s = r + 1, \ldots, R$ and $m = 0, \ldots, s - r$. Q.E.D.
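
Condition (A5) lends itself to a small numerical illustration. The sketch below reads (A5) as v(p'^r) = v(p^r) + v(p^{r+1}) and solves it for the promotion criteria. The quadratic disutility function and the three performance levels are hypothetical choices made only for this example; neither comes from the paper, and any strictly increasing v would serve.

```python
import math


def v(p):
    # Assumed disutility-of-performance function (illustrative only).
    return p ** 2


def v_inverse(x):
    # Inverse of the assumed v on p >= 0.
    return math.sqrt(x)


def promotion_criteria(p):
    # Solve (A5), v(p'^r) = v(p^r) + v(p^{r+1}), for r = 1, ..., R - 1.
    return [v_inverse(v(p[r]) + v(p[r + 1])) for r in range(len(p) - 1)]


# Hypothetical equilibrium performance levels p^1 < p^2 < p^3.
rank_performance = [3.0, 4.0, 12.0]
p_prime = promotion_criteria(rank_performance)

for r, criterion in enumerate(p_prime, start=1):
    print(f"promotion from rank {r} requires performance {criterion:.3f}")
```

Under these assumed values each promotion criterion exceeds the performance needed to stay in the next rank, which is the ordering stated in proposition 4.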
Estimates of the Returns to Quality
and Coauthorship in Economic Academia


Raymond D. Sauer

Clemson University


Salaries of academic economists are studied to determine if
individuals receive differential returns to publishing articles of
varying quality and to coauthored versus single-authored articles.
Estimates based on detailed data and a flexible nonlinear
least-squares procedure indicate that substantial returns to quality
exist and that an individual's return from a coauthored paper with n
authors is approximately 1/n times that of a single-authored paper.


It is enough to check the growth of science that efforts 
and labors in this field go unrewarded. [Francis Bacon; 
quoted in Merton (1973, p. 297)] 

I. Introduction 

The reward structure in academia has been a subject of keen interest
to economists, sociologists, and historians of science. Two topics in
this inquiry are of particular interest to economists: (1) the
existence of incentives capable of promoting the growth of knowledge,
in which economists have tended to specialize; and (2) the
consequences of competition among scientists within the structure of
rewards (see esp. Merton 1973). The chief concern of this study is a
more careful documentation of the incentives academicians face.
Ashenfelter, Bob Halvorsen, Mason Gerety, Tim Sass, and an anonymous referee
helped to improve this paper.

This paper presents an empirical analysis of the earnings function
for academic economists that is designed to address two questions. 
First and more important, what are the returns to quality in academic 
research? Second, what are the relative returns to single authorship 
versus coauthorship of published research? These questions have 
been neglected in previous investigations. 

Earnings functions similar to the one used in this paper have been
estimated in several studies for various academic disciplines. The
most extensive of these is a study by Tuckman (1976, chap. 5), which
examined rewards across disciplines for over 50,000 faculty at 301
institutions. Tuckman measured systematic and significant returns to
publishing for 17 of 18 nonprofessional disciplines, including
economics.¹ The exception, anthropology, was also the only field to
lack a measurable return to administrative service. Finally, 16 of the
18 fields, including economics, failed to deliver significant salary
increments for good teaching. This consistency of rewards across
disciplines gives us confidence that the results of this study will
not be peculiar to economics but will shed light on the incentives
that exist elsewhere in academia.

Other studies with smaller samples and more narrow foci have
confirmed and expanded on Tuckman's findings.² Yet virtually all have
imposed stringent assumptions on the data that make it impossible for
them to address the concerns of this paper. The nature of these
assumptions and the manner in which they are relaxed in this paper
are discussed in Section II. Section III outlines the estimation
procedure used to calculate the returns to differences in research
quality and the relative return to coauthorship. Section IV contains
some concluding remarks.

II. The Measurement of Research Output 

A. Measurement Issues 

Previous studies of academic salaries typically estimate a linear
equation in which academic salary is regressed on articles published,
“experience,” and other productivity variables. This study proceeds in
a similar fashion, with one important difference. In the past,
researchers have assumed that a “publication is a publication,” be it
in Economica or Econometrica, be it a note or a full-length paper, be
it written by one author or two. These restrictions are relaxed in
this paper by utilizing more detailed data. Issues relating to each of
these factors are discussed below.

1 Tuckman reported that the median number of articles published was in
the 5-10 category for 11 of the 18 fields. The figure was lower for
music and higher for psychology and all the natural sciences. Further,
in the natural sciences and psychology, a greater number of articles
were required to obtain significant salary differentials than in other
fields. Tuckman speculated that these differences were due to
differential costs in producing an “article” (note that he did not
distinguish coauthored articles). Merton (1973, pp. 470-74, 547) found
that in the natural sciences, articles are shorter, more likely to be
coauthored, and less likely to be rejected, which is consistent with
Tuckman's conjecture.

2 These include Siegfried and White (1973), Tuckman and Leahey (1975),
Tuckman and Hagemann (1976), Hansen, Weisbrod, and Strauss (1978),
Hamermesh, Johnson, and Weisbrod (1982), and Diamond (1986).

Liebowitz and Palmer (1983) report results of a survey of department
chairmen that indicate that chairmen ordinarily “assign a weight” to
coauthored papers that exceeds 1/n (with n being the number of
authors), presumably to encourage collaborative research. But since
economists are mobile, it is ultimately the market that would
determine the relative value of coauthored papers, and it is not at
all obvious what the market-determined value would be.

In a market in which collaboration entailed no extra costs, one
would expect to read very few single-authored papers if monetary
rewards were as described above. On the other hand, if researchers
were less productive at coauthored research than single-authored
research, a reward scheme encouraging the former entails a
considerable cost in terms of forgone research output. Clearly then,
the “equilibrium” weight for coauthored work is a function of
production (and “taste”) parameters that are difficult to observe.
Fortunately, one can estimate the sample weight for coauthored
research by recording data on coauthored and single-authored
productivity indicators separately, which is the approach taken in
this paper.

In addition, the following analysis recognizes the difference
between notes and full-length articles. While few would object to the
assertion that the former are less valuable than the latter, drawing a
line to separate one from the other would be inappropriate. Yet if one
assumes that journal editors allocate space as value maximizers, it
naturally follows that articles of greater length are more valuable
than those of lesser length (on average). This is addressed in the
data by defining an individual's publication measure for each journal
as the sum of pages published therein. Since page sizes vary between
journals, pages in each journal are adjusted to pages of American
Economic Review equivalent size (AEQ pages).

Finally, unlike the poetic rose, all 10-AEQ-page articles are not the
same. Studies using citations in economics (Liebowitz and Palmer
1984) and the natural sciences (Garfield 1972) indicate that articles
published in a small core of journals account for the majority of
references in the literature.⁴ Liebowitz and Palmer (1984) constructed
several rankings of economics journals based on citation frequencies.
The ranking that “probably comes closest to an ideal measure of the
impact on the economics profession of manuscripts published in various
journals” (p. 83) shows a sharp decline in the impact-adjusted
citation frequency as one moves down the list.⁵ For example, articles
in the tenth-ranked journal received 36.45 percent of the
impact-adjusted citations per character of the top journal; this
figure is 28.06 percent for the twentieth- and 7.15 percent for the
fortieth-ranked journal.

3 Coauthored work with n > 2 authors was weighted at 2/n times work
with two authors. This was necessary because of the small number of
papers written by three or four individuals. The Data Appendix
discusses the implications of this weighting scheme.

A possible method of adjusting for differences in journal quality
would be to weight an article by the impact-adjusted citation
frequency of the journal in which it is published. Although this
adjustment may be too severe, it does raise the question of just how
steep the quality gradient may be. This question is examined in detail
in Section III.
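
The two adjustments described in this section, conversion to AEQ pages and weighting by journal quality, can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The page-size factors, the journal labels, and the sample vita are hypothetical. The weight of 0.3645 is the tenth-ranked figure quoted above, and the form w_j ** a for the quality weight is assumed here following the note to table 1, with a = 0 giving unweighted pages and a = 1 giving the full citation-frequency adjustment.

```python
# Hypothetical page-size factors: AEQ pages per printed page (assumed values).
AEQ_FACTOR = {"top_journal": 1.00, "tenth_journal": 0.80}

# Impact-adjusted citation frequency relative to the top journal.
# 0.3645 is the tenth-ranked figure quoted in the text; labels are placeholders.
WEIGHT = {"top_journal": 1.0, "tenth_journal": 0.3645}


def aeq_pages(raw_pages):
    # Convert printed pages in each journal to AEQ pages.
    return {journal: n * AEQ_FACTOR[journal] for journal, n in raw_pages.items()}


def quality_pages(raw_pages, a):
    # Sum AEQ pages across journals, each weighted by w_j ** a (assumed form).
    return sum(pages * WEIGHT[journal] ** a
               for journal, pages in aeq_pages(raw_pages).items())


# Hypothetical vita: printed pages published in each journal.
vita = {"top_journal": 10, "tenth_journal": 20}

unweighted = quality_pages(vita, a=0.0)     # plain AEQ page count
fully_weighted = quality_pages(vita, a=1.0)  # full citation-frequency weighting
print(unweighted, fully_weighted)
```

Intermediate values of a between 0 and 1 give a flatter quality gradient than full citation weighting, which is one way to frame the question of how steep the gradient is.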


B. The Data Set 

The data set used in this study consists of 140 academic economists 
who are members of the associate or full professor rank at seven “top 
40” departments. These departments are those that responded fully 
to a request for salary information and vitae of senior faculty that was 
sent to the top 40 departments listed in Graves, Marchand, and 
Thompson (1982). The average rank of the departments in the sample
is 24.

The data were tabulated in light of the issues discussed above.
Measures of research productivity were based on information provided
on the vitae and in the Social Science Citations Index, with
coauthored work recorded separately.⁶ The data on each individual
consist of the following variables (see the Data Appendix for details):
SALARY: 9-month salary for the 1982-83 academic year; PROD: a
single-author productivity vector consisting of PAGES: AEQ pages in
each of the top 100 journals ranked in Liebowitz and Palmer (1984),
OPAPERS: other papers listed on the vitae, BOOKS: number of books
written, and CITES: citations received to published work, 1976-82;
COPROD: the coauthored counterpart to PROD; EXPER: years since receipt
of the Ph.D.; YRAD: years of administrative service; and DUMMY: a
department-specific 0-1 dummy variable.

4 Citations have also been used to construct loose rankings of
scientists. Quandt (1976, p. 741) stated that citation counts permit
“tentative predictions as to who future [Nobel] prizewinners will be.”
Time has supported this conclusion. Quandt's list of the 26 most cited
economists in 1970 contains seven subsequent prizewinners (among them
the 1987 laureate) along with three who had been honored before the
article was published. Similar figures for the natural sciences are
reported in Garfield (1978) for both Nobel prizes and other highly
regarded honors.

5 This ranking is their table 2, col. 2, ranking. The impact
adjustment refers to an iterative weighting scheme that gives greater
weight to citations received from higher-ranked journals.

6 The period for which citations were collected was 1976-82, which is
something of a compromise between conflicting notions about the proper
period for this analysis (see the Data Appendix for further
discussion). Making an accurate count of the number of citations and
coauthored citations, given the multiple listings of most authors,
required an undue amount of patience and attention to detail. The
great care taken by M. C. Matheson in collecting the data is much
appreciated.

TABLE 1

Means and Standard Deviations of 1982 9-Month Salary and Personal
Indicators for 140 Economists at Seven Universities

Variable             Mean      Standard Deviation
SALARY               42,935    10,797
PAGES (a = 0)        67.2      73.8
COPAGES              48.2      56.0
CITES                95.4      184.9
COCITES              39.0      68.9
OPAPERS              17.2      17.2
CO-OPAPERS           5.2       6.8
BOOKS                1.0       2.2
COBOOKS              .4        .7
EXPER                17.7      8.9
YRAD                 1.1       3.5
QPAGES (a = a*)      47.5      60.0
COQPAGES             29.4      38.2
ARTICLES             6.7       7.1
COARTICLES           4.3       4.9

Note: QPAGES is the sum of single-authored AEQ pages in each journal
weighted by $w_j^{a^*}$, where $w_j$ is the impact-adjusted citation
frequency for journal $j$, and $a^*$ is the optimal exponent obtained in
Sec. III.

Means and standard deviations for these variables (using the sum of 
PAGES for each individual) are listed in table 1. Also listed are means 
for the number of single- and coauthored articles (in ranked jour¬ 
nals), which are 6.7 and 4,3, respectively. 7 But how does this sample 
compare with the universe of publishing economists? One available 
yardstick is citation frequency. Liebowitz and Palmer (1983) compiled 
a frequency distribution of citations for over 3,000 academic econo¬ 
mists from more than 100 departments. Table 2 lists the number of 
economists from this sample within each percentile. It seems that the 
sample distribution encompasses a broad range of the profession. If 
anything, the top 20 percent of Liebowitz and Palmer’s tabulation is 


7 The journals most frequently published in by this sample are American Economic
Review, Journal of Political Economy, Review of Economics and Statistics, Econometrica, and
Quarterly Journal of Economics, accounting for 37 percent of all articles.
TABLE 2

Average Citations per Year for the Sample Relative to the Profession

Number               Percentile        Sample       Sample
of Citations         for Profession    Frequency    Percentage
$\ge 167$            99                1            .7
$61 \le x < 167$     95                5            3.6
$30 \le x < 61$      90                17           12.1
$12 \le x < 30$      80                27           19.3
$6 \le x < 12$       70                21           15.0
$4 \le x < 6$        60                13           9.3
$2 \le x < 4$        50                22           15.7
$1 \le x < 2$        40                21           15.0
0                    $< 40$            13           9.3

Note.—Coauthored citations are weighted at $1/n$.


overrepresented here, as one might expect since the sample is restricted
to the top 40 departments.

Before we go any further, it is useful to consider the possible effects 
of the sample selection process on the results. Perhaps the biggest 
concern is self-selection among the respondents. For the most part, 
departments in this sample keep vitae on file. Thus the chief criterion 
for inclusion in this study is the low cost of detailed information on 
faculty members. 8 Offers of information on self-selected faculty were 
turned down from two departments. However, one department voluntarily
sent such information, which was retained. In this case, a
reasonable conjecture is that nonresponse is concentrated among the
less productive. This is unfortunate if nonresponse entails a reduction
in pages published in lower-ranked journals since it would make estimates
of differential returns to quality less precise.

III. Estimation of the Returns to Quality 
and Coauthorship 

A. Specification of the Model 

The two primary questions of interest are the relative return to quality
and the relative return to coauthorship of published research.
These questions are addressed using an iterative nonlinear least-squares
procedure to find the optimal adjustment of AEQ pages for
differences in quality. 9 Quality-adjusted AEQ pages for each individual
are given by

$$\mathrm{QPAGES} = \sum_{j=1}^{100} p_j \cdot w_j^{\alpha}, \qquad 0 \le \alpha \le 1,$$

where $p_j$ is pages and $w_j$ is the impact-adjusted citation frequency for
journal $j$. The transformation $w_j^{\alpha}$, for $0 \le \alpha \le 1$, is useful since it
encompasses a wide range of alternatives. Using $\alpha = 0$ results in no
distinction between journals. For $\alpha = 1$, the quality weights are simply
the impact-adjusted citation frequencies, which would force an AEQ
page in the fortieth journal to equal 7.15 percent of an AEQ page in
the top journal. Intermediate values of $\alpha$ yield more modest declines
as one moves down the journal rankings. The calculated QPAGES
replaces PAGES in PROD and COPROD.

8 Not all the schools in this sample are public. Chairs of some public departments
gave as a reason for nonresponse the cost of gathering together the information; others
were unwilling to report salary figures.

9 Quality adjustment is made only on journal pages and not on citations. Where
citations occur is probably significant (and expensive to record), although frequency
and quality are likely to be more closely related here than with journal pages.
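
The effect of the exponent on the quality weights can be illustrated with a short sketch. The function names and the two-journal example are hypothetical; only the 7.15 percent relative citation frequency for the fortieth journal and the form of the weight come from the text.

```python
def quality_weight(rel_freq, alpha):
    """Weight on one AEQ page in a journal whose impact-adjusted citation
    frequency, relative to the top journal, is rel_freq."""
    return rel_freq ** alpha


def qpages(pages, rel_freqs, alpha):
    """Quality-adjusted pages: the sum over journals of p_j * w_j ** alpha."""
    return sum(p * quality_weight(w, alpha) for p, w in zip(pages, rel_freqs))


FORTIETH = 0.0715  # relative citation frequency of the fortieth journal (from the text)

# alpha = 0 counts every page equally, alpha = 1 applies the raw citation
# frequencies, and an intermediate exponent gives a gentler decline.
for alpha in (0.0, 0.30, 1.0):
    print(alpha, quality_weight(FORTIETH, alpha))
```

With an exponent of 0.30, a page in the fortieth journal is weighted at about 45 percent of a page in the top journal, rather than the 7.15 percent implied by an exponent of 1.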

The following equation was estimated using values of
$\alpha = (0.00, 0.05, 0.10, 0.15, \ldots, 1.00)$ in sequence:

$$\begin{aligned}
\mathrm{LOG(SALARY)} = {} & \mathrm{INTERCEPT} + \beta_1 \cdot \mathrm{PROD} + \beta_2 \cdot \mathrm{PROD}^2 \\
& + \Gamma \cdot \beta_1 \cdot \mathrm{COPROD} + \Gamma \cdot \beta_2 \cdot \mathrm{COPROD}^2 \\
& + \beta_3 \cdot \mathrm{EXPER} + \beta_4 \cdot \mathrm{EXPER}^2 + \beta_5 \cdot \mathrm{YRAD} \\
& + \beta_6 \cdot \mathrm{YRAD}^2 + \beta_7 \cdot \mathrm{DUMMY} + \epsilon,
\end{aligned}$$

where $\beta_1$ is the estimated coefficient vector for single-authored research
indicators ($\beta_2$ for the squared variables), and $\Gamma$ is the estimated
weight for coauthored research. The value of $\alpha$ that minimized the
sum of squared residuals was taken to be the optimum. The resulting
estimates are listed in the first column of table 3.
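
The search over the exponent can be sketched as a grid search that refits the regression at each candidate value and keeps the one with the smallest sum of squared residuals. This is a deliberately reduced illustration, not the procedure used in the paper: it assumes a single regressor (quality-adjusted pages), omits the coauthorship weight and all other covariates, and uses made-up, noise-free data for three hypothetical journals.

```python
def qpages(pages, weights, alpha):
    """Quality-adjusted pages: sum over journals of p_j * w_j ** alpha."""
    return sum(p * w ** alpha for p, w in zip(pages, weights))


def ols_ssr(x, y):
    """Sum of squared residuals from regressing y on a constant and x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def grid_search_alpha(page_rows, weights, y, grid):
    """Return the exponent on the grid that minimizes the SSR."""
    def ssr(alpha):
        x = [qpages(row, weights, alpha) for row in page_rows]
        return ols_ssr(x, y)
    return min(grid, key=ssr)


# Hypothetical data: pages of six individuals in three journals.
WEIGHTS = [1.0, 0.3645, 0.0715]
PAGE_ROWS = [(10, 0, 0), (0, 20, 0), (0, 0, 30), (5, 5, 5), (12, 3, 8), (2, 15, 1)]
GRID = [i * 0.05 for i in range(21)]   # 0.00, 0.05, ..., 1.00
TRUE_ALPHA = GRID[6]                   # 0.30, used only to generate the data
LOG_SALARY = [10.0 + 0.003 * qpages(r, WEIGHTS, TRUE_ALPHA) for r in PAGE_ROWS]

best_alpha = grid_search_alpha(PAGE_ROWS, WEIGHTS, LOG_SALARY, GRID)
print(best_alpha)
```

Because the synthetic salaries are generated without noise, the search recovers the exponent used to build them; with real data the minimum would only be an estimate.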

These estimates all have the expected signs and are reasonably 
precise. However, regression diagnostics (see Belsley, Kuh, and 
Welsch 1980) identify five observations with excessive influence on 
the coefficient and variance estimates. Three of these are heavily cited 
"superstars,” whose effects are to dramatically reduce the coefficient 
estimate for CITES. 10 In the light of this evidence, it was determined 
to calculate the returns to quality using estimations from both the full 
sample of 140 and a restricted sample that excludes these observations.

The specification search outlined above was repeated with the restricted
sample. The coefficient estimates for the regression equation
employing the optimal value of $\alpha^* = 0.30$ are listed in column 2 of
table 3. Note that for either sample a test of the hypothesis that $\alpha = 0$
against $\alpha = \alpha^*$ is rejected at the .01 level.


10 The minimum effect of adding only one of these three observations to the remainder
of the sample is to reduce the coefficient estimate for CITES by .001 (about the
value when the remainder of the sample is used).
TABLE 3

Coefficient Estimates of the Monetary Returns to Publication

Variable                    Full Sample    Restricted Sample
                            (1)            (2)
Page weight ($\alpha^*$)    .15            .30
Intercept                   10.06          10.08
                            (150.9)        (131.4)
$\Gamma$                    .614           .560
                            (3.2)          (3.2)
QPAGES                      .0030          .0033
                            (4.6)          (3.2)
QPAGES$^2$                  -7.2E-06       -1.4E-05
                            (2.8)          (2.0)
CITES                       .0005          .0021
                            (2.2)          (3.9)
CITES$^2$                   -1.6E-07       -4.6E-06
                            (.7)           (3.3)
BOOKS                       .0014          .0)22
                            (1)            (.5)
BOOKS$^2$                   .0005          -.0047
                            (.6)           (•«)
OPAPERS                     .0012          .0011
                            (.6)           (S)
OPAPERS$^2$                 -1.6E-05       -1.5E-05
                            (.6)           (.6)
EXPER                       .0178          .0129
                            (3.3)          (1.9)
EXPER$^2$                   -2.5E-04       1.3E-04
                            (2.1)          (.7)
YRAD                        .0565          .0560
                            (4.9)          (4.9)
YRAD$^2$                    -.0024         -.0024
                            (3.8)          (3.8)
Standard error              .0212          .0198
$R^2$                       .696           .656

Note.—Numbers in parentheses are t-statistics. $\Gamma$ is the estimated weight for coauthored research relative to
single-authored research. Department-specific 0-1 dummy variables (DUMMY) range from -.074 to .209 for the
col. 2 equation and are jointly significant at the .01 level. Pages in each journal are weighted by $(w_j)^{\alpha^*}$, where $w_j$ is the
impact-adjusted citation frequency for journal $j$ and $\alpha^*$ is the optimal weight.


It is clear from these estimates that citations and journal articles are
the most important of the productivity indicators in determining salary.
The coefficient estimates for BOOKS and OPAPERS are both
low and imprecise. This result is in accord with earlier studies and
may reflect the relatively poor measurement of these research indicators.
The returns to administrative service are sizable, although
they decline very fast. Given the opportunity costs involved, it may
pay to be the chairperson, but not for very long.

The estimated $\Gamma$ is 0.56 with a standard error of 0.18. Such a point
estimate cannot tell us whether coauthors each receive returns more
than $1/n$ times that of a single author ($1/n = 0.50$ here), yet it does
indicate that some form of discounting takes place. In particular, the
maintained hypothesis that $\Gamma = 1$ that has commonly been used can
be rejected at the .05 level. 11 Following Leamer (1983), a battery of
regressions was run to examine the sensitivity of the regression estimates
to alternative specifications. The range of the $\Gamma$ estimates was
0.429-0.689 with a mean of 0.555. We thus know that the table 3
estimate of $\Gamma$ is not an outlier and can have some confidence in asserting
that the weight for coauthored work is not much different from
$1/n$. 12


B. Calculation of the Quality Gradient 

Publication of an article appears to have a measurable impact on 
salary independent of citations. 13 The estimates in column 2 indicate 
that the incremental return from an AEQ page in the top-ranked 
journal (1.6 pages in the J.P.E.) is 0.17 percent of salary at the sample 
mean ($72.27 in 1982 dollars). The gradient is such that an article of 
equivalent size in the tenth-ranked journal returns $(0.3645)^{0.30}$, or 73.9
percent of an article in the top journal. Subsequent benchmarks are 
45.3 percent for the fortieth- and 21.0 percent for the eightieth- 
ranked journals. 

The full return from publication includes the additional effects 
from being cited. Articles published in the top journal during the 
1975-79 period received 0.2522 citations per AEQ page in 1980 
(Liebowitz and Palmer 1984). With a constant citation rate over the 7- 
year “citation period" in this study, the average 10-AEQ-page article 
in the top journal would yield an increment of 17.65 citations, which 
are estimated to yield 0.12 percent each, for a 2.09 percent increase in 
salary. For other journals, citations are less frequent. The estimated 
returns from citations as a percentage of the top journal range from 
51.5 percent for the tenth-ranked journal to 17.2 percent for the 
eightieth-ranked journal. 14 
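
The arithmetic behind the citation component can be retraced from the figures quoted above. The sketch below uses only the rounded numbers given in the text (0.2522 citations per page per year, a 7-year period, 0.12 percent per citation, and 0.17 percent per page), so it approximates the reported returns rather than reproducing them exactly; the variable names are illustrative.

```python
CITES_PER_PAGE_PER_YEAR = 0.2522   # top journal, from the text
ARTICLE_PAGES = 10                 # AEQ pages in the example article
CITATION_YEARS = 7                 # length of the citation period
RETURN_PER_CITE_PCT = 0.12         # percent of salary per citation (rounded)
RETURN_PER_PAGE_PCT = 0.17         # percent of salary per AEQ page (rounded)

# Expected citations: cites per page times pages times years.
expected_cites = CITES_PER_PAGE_PER_YEAR * ARTICLE_PAGES * CITATION_YEARS

cite_return_pct = expected_cites * RETURN_PER_CITE_PCT
page_return_pct = ARTICLE_PAGES * RETURN_PER_PAGE_PCT
full_return_pct = page_return_pct + cite_return_pct

print(expected_cites, cite_return_pct, full_return_pct)
```

With these rounded inputs the citation component comes to roughly 2.1 percent and the combined return to roughly 3.8 percent of salary, close to the 2.09 percent figure reported above; the small gap is what one would expect from rounding in the quoted coefficients.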

11 With a test based on the likelihood function, a 95 percent confidence interval for
$\Gamma$ was calculated as the range (0.29, 0.95).

12 The alternative specifications used were all possible subsets obtained by (a) deleting
all variables other than QPAGES, CITES, and DUMMY and (b) repeating the
procedure above using (unlogged) salary as the dependent variable. Note that fairly
narrow ranges for the incremental returns from QPAGES and CITES in (a) were obtained;
these were 0.0012-0.0017 and 0.0012-0.0019, respectively, at the sample
means.

13 One would naturally expect QPAGES and CITES to be highly correlated, which is 
indeed the case. However, an analysis of the variance decompositions suggested by 
Belsley et al. (1980) revealed that the coefficient estimates for QPAGES and CITES are 
not degraded by collinearity problems. However, there is a “moderate” linear dependency
between the intercept, EXPER, and EXPER$^2$, which may help explain the imprecise
estimates of the latter coefficients.

14 See the Data Appendix for details on the construction of these estimates. 
TABLE 4

Estimates of the Quality Gradient

                     Average Full Return     Returns for Other Journals as a
                     to 10 AEQ Pages         Percentage of the Top Journal
                     in the Top Journal      No. 10    No. 20    No. 40    No. 80
Restricted Sample    .0380                   61.6      53.1      34.1      18.9
Full Sample          .0286                   74.3      68.5      53.0      36.2

Note.—The full return includes the calculated return from QPAGES and expected CITES. Each row is based on
the corresponding coefficient estimates in table 3, the quality weights $w_j^{\alpha^*}$, and the expected citations for each journal.


The full return to a 10-AEQ-page article in the top journal is thus 
estimated to be a 3.80 percent increase in salary ($1,602 in 1982 
dollars), which seems a sizable sum. For other journals, the return as a 
percentage of the top journal declines to 18.9 percent at the eightieth 
rank. This quality gradient is presented in greater detail in the first 
row of table 4. A gradient using the full-sample results is also presented
in this table. Both estimates for the full return in the top
journal are large, and the gradients are steep. Once one moves below
the twentieth- or fortieth-ranked journal, returns drop to roughly
one-half the return from the top journal. These figures clearly indicate
that there are significant monetary returns to high-quality research
in the economics profession.


IV. Concluding Remarks 

The estimates given in this paper are obtained from a sample of 
academic economists from seven of the top 40 departments in the 
United States. While estimates of the coefficients of interest are both 
reasonably precise and insensitive to alternative specifications, the 
calculated monetary values will not be representative of particular 
departments, much less other disciplines. However, the broad commonalities
in academic reward structures discussed above imply that
qualitative inferences for other fields can be made from this study. 
Hence, one may infer that monetary returns to research quality in 
academia are measurably large and provide nonnegligible incentives 
to produce high-quality research. How effective these incentives are 
at influencing the behavior of scientists is a question that awaits further
investigation.
Data Appendix

This Appendix provides a more detailed description of the data collection 
and definition procedures. First, note that all coauthored work was recorded 
separately from single-authored work. Coauthored work with $n > 2$ authors
was weighted at $2/n$ times work with two authors. Hence, for an article with
three authors, each would receive 2/3 of what authors contributing to a two-author
paper would receive. This means that if the coauthorship weight in
the sample does not differ substantially from $1/n$ ($\Gamma = 0.50$), the three coauthors
would each receive a monetary return of 1/3 that of a single author.

Journal pages for articles listed on the vitae were recorded for each individ¬ 
ual. Only articles published in the 108 journals listed in Liebowitz and Palmer 
(1984) were recorded under this variable. More often than not, recording 
pages entailed looking up the journal in which the article was published since 
page numbers were infrequently listed on the vitae. Comments and replies 
were not counted. Articles in annual conference proceedings (including the 
A.E.R. Papers and Proceedings) were included in the OPAPERS category. 

All citations for the period 1976-82 other than self-citations and citations 
to textbooks were counted. Note that since the Social Science Citations Index 
attributes citations only to the first listed author, making an accurate count of 
citations to coauthored work often required searching for particular citations 
listed under a coauthor’s name. The 1976-82 period was used for two reasons.
First, it is unclear whether a stock of lifetime citations or the flow of very
recent citations is the appropriate variable to use. Selecting a 7-year period 
amounts to a compromise on this issue. Second, this period is convenient 
since the Social Science Citations Index has a 5-year cumulative volume for 
1976-80, which substantially reduces the cost of looking up citations for 
several years. The years 1981 and 1982 were also included since current 
citations are likely to measure current influence. 

Any working paper or publication not listed in Liebowitz and Palmer 
(1984) was classified in OPAPERS. Similarly, any published book other than a 
textbook was counted in BOOKS. The varying qualities of the members of 
these variables may be partly responsible for the lack of precision in their 
estimated coefficients. Adopting a more rigorous classification scheme for 
these research indicators may yield more informative estimates but is beyond 
the scope of the current paper. 

The ranking used in the paper is based on impact-adjusted citation frequencies.
To calculate monetary returns from expected citations, estimates of
(unadjusted) citation frequencies were used rather than actual measures. This 
was done to avoid problems stemming from the nonmonotonic relationship 
between the two actual measures. However, the differences between quality 
gradients based on the estimated and actual citation frequencies are very 
minor. Note also that Oromaner (1981, p. 89) and others have found that the 
frequency of citations to an article is “quite stable during the first (postpublication)
decade.” Hence, the simple method used to estimate the increment
of citations due to publication of an article (cites per page times pages
times years) has empirical justification. 
Tournament contracts are characterized that simultaneously elicit
first-best efficient effort levels and self-selection of risk-neutral
heterogeneous workers into ability-specific contracts. Comparisons
across self-selected ability types are shown to be sometimes necessary
to attain efficiency. Rationales are explored for not commonly observing
such contracts in hierarchical organizations.


I. Introduction 

The design of optimal multiagent incentive contracts has been the
focus of much recent research on moral hazard, that is, contractual
inducement of unobservable actions taken by agents, given contracts
designed by principals. Following the early work of Lazear and Rosen
(1981) and Bhattacharya (1982), rank-order tournament contracts in
particular have been extensively analyzed as contracts that result in
adjusting for the effects of common (correlated) shocks on the observed
performance monitors of all agents. The second-best optimality
of tournament contracts in this context has been proved for a
restricted scenario in which marginal productivity of effort is not
affected by common shocks (Green and Stokey 1983), whereas Lazear
and Rosen (1981) and Nalebuff and Stiglitz (1983) have compared
tournaments with linear piece-rate contracts. The limitations of such
comparisons have been noted by Holmström (1982). Bhattacharya
(1983) has rationalized the use of tournaments even without correlated
productivity shocks, but with principal’s moral hazard, and
compared their performance with that of termination-based incentives
schemes. 1

Given the rapid pace of developments in this area, particularly of
results associated with a scenario of a large set of ex ante homogeneous
agents, the original empirical context that perhaps motivated
the pioneering work of Lazear and Rosen has become somewhat overlooked.
Specifically, except for state lotteries and football pools, tournament
contracts are largely observed in the economic context of
hierarchical organizations. In particular, the empirically observed
contests each typically have a restricted set of participants, far smaller
than the size of the organization, that, at the top, is thought of as the
principal. Also, in most observed contests, agents who are compared
with each other in order to determine wages and promotions work in
the same or highly contiguous levels of the hierarchy in question.
Most analytical models of hierarchies (e.g., Rosen 1982) would thus
suggest that the inherent ability levels of participants are (approximately)
homogeneous within contests and heterogeneous across
them. We seek to explain such “within-cohort” tournaments endogenously.

In their early work on these topics, Lazear and Rosen (1981) attempted
to deal with the issue of the performance of tournament
contracts given asymmetrically informed heterogeneous agents. They
claim to have shown that even with risk-neutral agents, tournament
contracts cannot attain the first-best, ex ante optimal, allocation
incentive-compatibly, although they can do so for ex ante homogeneous
agents, given plausible assumptions about the probability distribution
of monitoring noises. We first show in Section II that if wages can be
made contingent on performance contests and comparison across
agents with different abilities (which are revealed through self-selection
among wage contracts) is allowed, then the Lazear-Rosen
result can be easily reversed. However, if wages have to satisfy limited
liability constraints, then comparisons across ability types rather than
1 Related observations of a more circumscribed nature can be found in Carmichael
(1983) and Malcomson (1984). The essential feature of some tournament contracts is
that the principal’s total wage bill is predetermined. We do not focus on this feature
here.

2 One may rationalize that such ordinal comparisons are relative, rather than with
respect to some absolute output standard or test, on the grounds of either (i) importance
of correlated shocks across agents or (ii) the ephemerality of “output” measures.
within such cohorts might result in tighter constraints for the firm.
For example, if each self-selected ability type has to be paid at least its
expected marginal product in all future contingencies, assuming a
“no-quitting” constraint arising from interfirm competition ex post,
initial wages must be below first-period expected marginal products.
We show through an example that if performance comparisons across
ability levels are used to bring about the ex ante optimal self-selection
of ability types into contracts, as well as optimal effort levels, then the
degree to which failing wages lie below marginal products is increased.
As a result, limited liability constraints are more likely to be
violated. Our analysis here extends the insights of Bhattacharya
(1980) and Guasch and Weiss (1980, 1982) to a scenario with endogenous
choice of effort levels.

In Section III, we briefly consider a different set of issues associated
with heterogeneity of agents that arise in integrating the theory of
incentive cum sorting schemes with that of hierarchies with interactive
productivity effects across levels. Most existing models of hierarchies
(e.g., Calvo and Wellisz 1979; Rosen 1982) that focus on the
greater “percolation” of the productivities of higher-ability agents
when placed in higher levels of the hierarchy typically ignore the
simultaneous endogenization of incentive schemes that elicit any
asymmetrically known information about abilities. It is easy to see that
if rewards can be provided only by promotions across task levels in
hierarchies, then noisy performance comparisons across levels may
lead ex post to a situation in which an agent of lower ability is placed
at a higher level, resulting in a loss of organizational efficiency. In our
work here, we focus on a different set of issues. We do not assume that
promotions across meaningfully different task levels are the only
means of rewarding agents, but explore instead the issues in efficiency
attainment that arise solely because of interactions among the
productivities of multiple types of jobs and agents. The resulting
rationalization of within-cohort performance comparisons is complemented
by the earlier one based on nonnegativity of wages.

The paper is organized as follows. In Section II, we take up the 
issues connected with asymmetrically informed heterogeneous agents 
and limited liability constraints. Connections with hierarchical organizations
and the implications for efficiency attainment with tournament
contracts are discussed in Section III, which also concludes with
suggestions for future research.

II. Efficient Contracts with Agent Heterogeneity 

In the first subsection we show how, by proper design of contracts, 
efficiency can be attained with rank-order tournament schemes given 
asymmetrically known heterogeneity of productive agents. In the second
subsection we analyze how limited liability constraints can bias the
design of tournaments toward those based on comparisons within 
cohorts. 

A. Choice among Comparison Standards 

As in Lazear and Rosen (1981), we assume that the ex post output
(performance indicator) $d_j$ for the $j$th agent satisfies

$$d_j = \mu_j + \epsilon_j, \tag{1}$$

where $\mu_j$ is the effort of $j$, unobservable to the principal, and $\epsilon_j$ is a
random variable with distribution function $F(\epsilon)$, identically and independently
distributed across agents. We assume that $\epsilon_j$ has mean zero
and bounded variance and that the support of $F(\epsilon)$ is unbounded
below. 3 Principals are competitive firms, and they, as well as all agents,
are risk neutral. An agent putting in effort $\mu$ generates the expected
product $V\mu$ for the principal independent of the number of other
agents, that is, constant returns to scale. In the case of ex ante homogeneous
agents, which we recapitulate briefly, all agents have utility
functions given by

$$Z_j = E(W_j) - C(\mu_j), \tag{2}$$

where $W_j$ is the $j$th agent’s random performance-contingent wealth,
$E(\cdot)$ is the expectations operator, and $C(\cdot)$ is the common effort-disutility
function. It is assumed that the principal can make $W_j$ contingent
only on a relative ordinal comparison of $d_j$ with $d_k$ for some $k$,
that is, $W_j$ depends only on an indicator $I_j$, where $I_j = 0$ if $d_j \ge d_k$ and 1 otherwise. The
distribution function for $(\epsilon_j - \epsilon_k)$ is given by $G(\epsilon)$, which is assumed to
be twice differentiable with the density function $g(\epsilon)$, and its derivative
$g'(\epsilon)$. By replacing $G(\epsilon)$ with $F(\epsilon)$, our results can be reinterpreted in
terms of absolute “test standards.”

We assume that $C(\mu)$ is strictly increasing, convex, and twice differentiable
and satisfies the Inada conditions that as $\mu \to 0$, $C'(\mu) \to 0$,
and as $\mu \to \infty$, $C'(\mu) \to M > V$. In line with Lazear and Rosen (1981),
in particular their claim that heterogeneity precludes efficiency attainment
with tournaments, we also assume that the density function
$g(\epsilon)$ is strictly unimodal at $\epsilon = 0$ and also that $g(\epsilon)$ is symmetric. These
assumptions are sufficient to ensure that problems with multiple Nash
equilibria do not arise with a tournament scheme. Specifically, we are
assuming that $g(\epsilon) = g(-\epsilon)$ and that $g'(\epsilon) \gtrless 0$ as $\epsilon \lessgtr 0$. After presenting
our main result, we shall discuss relaxing this assumption, as well
as the additive separability (in $\mu$ and $\epsilon$) of the structure. The welfare-optimal
(first-best) effort level $\mu^*$ is clearly characterized by

$$V = C'(\mu^*). \tag{3}$$

3 This rules out simple individualistic “forcing contracts.”

The ex ante homogeneous agents all solve the following problem in
deciding on effort level $\mu \in D = [0, \infty)$:

$$\max_{\mu \in D} Z(\mu, \mu^c) = W_1 - (W_1 - W_2)\,G(\mu^c - \mu) - C(\mu), \tag{4a}$$

where $W_{1j} = W_1$ equals the $j$th agent’s wage if $I_j = 0$, $W_{2j} = W_2$ is his
or her wage if $I_j = 1$, and $\mu^c$ is the conjectured effort level that agent $j$
perceives his “rival” $k$ to be exerting. Competitive attainment of first-best
Pareto efficiency is feasible if and only if there exist $W_1$ and $W_2$
satisfying

$$\mu^* = \operatorname*{argmax}_{\mu \in D} Z(\mu, \mu^*) \tag{4b}$$

and the competitive zero-profit condition is satisfied, that is,

$$V\mu^* = W_1 - (W_1 - W_2)\,G(0) = \frac{W_1 + W_2}{2}. \tag{5}$$


Taking first-order conditions in equation (4a), we obtain the necessary
condition for (4b) that

$$(W_1 - W_2)\,g(0) = C'(\mu^*) = V, \tag{6a}$$

which together with equation (5) implies that

$$W_1 = V\mu^* + \frac{V}{2g(0)}, \tag{6b}$$

$$W_2 = V\mu^* - \frac{V}{2g(0)}. \tag{6c}$$

To ensure that the first-order condition (6a) is sufficient, we examine
conditions such that $Z(\mu, \mu^*)$ is strictly concave in $\mu$, which implies
that $-(W_1 - W_2)\,g'(\mu^* - \mu) - C''(\mu) < 0$, or using (6a), it is sufficient
to assume that for all $\epsilon$

$$\frac{g'(\epsilon)}{g(0)} > -\inf_{\mu \in D} \frac{C''(\mu)}{V}, \tag{6d}$$

where the right-hand side of (6d) is assumed to be strictly negative. 4
Note that strict convexity of $C(\mu)$ eliminates symmetric Nash equilibria
(among agents $j$ and $k$) with $\mu \ne \mu^*$, and symmetry of $g(\epsilon)$ eliminates
asymmetric Nash equilibria with $\mu_j \ne \mu_k$ since
$(W_1 - W_2)\,g(|\mu_j - \mu_k|)$, $\mu_j \ne \mu_k$, cannot simultaneously equal $C'(\mu_j)$ and $C'(\mu_k)$.

4 This formalizes Lazear and Rosen’s statements that concavity follows from distributions
with sufficiently high variance.

Lazear and Rosen (1981) then argue that if there are two types of
agents, $a$ and $b$, and $a$ always has a greater marginal disutility of effort,
$C_a'(\mu) > C_b'(\mu)$, then the tournament contracts constructed to achieve a
first-best equilibrium cannot also satisfy the self-selection constraints.
In symbols, if $\{W_1, W_2\}$ satisfy (6b) and (6c) for $\mu_a^*$ (where $C_a'(\mu_a^*) = V$,
$C_b'(\mu_b^*) = V$, $\mu_a^* < \mu_b^*$) and similarly $\{W_3, W_4\}$ satisfy (6b) and (6c)
for type $b$, then it is clear that

$$W_3 - W_1 = V(\mu_b^* - \mu_a^*) > \frac{V}{g(0)}\left[G(\mu_b^* - \mu) - G(\mu_a^* - \mu)\right]$$

by strict unimodality of $g(\epsilon)$ at $\epsilon = 0$. Thus, a type $a$ agent is better off
in expected utility with the type $b$ contract because the differential
(“winning”) wage $W_3 - W_1$ strictly exceeds the higher expected penalty
from a stiffer “performance standard,” at any level of effort $\mu$.
Thus efficient contracts based on comparisons within cohorts are not
sustainable as an equilibrium since lower-ability types are better off
joining the tournament designed for the more able type. 5

To resolve this self-selection problem, we consider a different set of
contracts that are based on ordinal performance comparisons across
self-selected cohorts. Each agent’s ex post output is compared with
that of an arbitrary member of the self-selected cohort with the lowest
efficient $\mu^*$. For generality as well as ease of exposition, we consider a
continuum of types $n \in [a, b]$ having effort disutility functions $C(\mu, n)$
that are twice differentiable in both arguments. The $C(\mu, n)$ function
satisfies monotonicity, strict convexity, and Inada conditions in the
former argument as well as condition (6d), with $\partial^2 C(\mu, n)/\partial\mu\,\partial n < 0$.
The type-specific first-best efficient effort levels are now given by $\mu_n^*$,
with $V = \partial C(\mu_n^*, n)/\partial\mu$, and $\mu_n^*: [a, b] \to D \equiv [0, \infty)$ is strictly increasing and
differentiable in $n$. We also define the “active” subset of $D$ in an efficient
equilibrium as $A \equiv [\mu_a^*, \mu_b^*]$.

Our problem of effort cum ability elicitation has a competitive ex
ante efficient solution if there exists a set of functions $\{y^*(n), W(y)\}$
such that, for all $n \in [a, b]$, the domain of $y^*(n)$,

$$\{y^*(n), \mu_n^*\} = \operatorname*{argmax}_{\mu \in D,\; y \in Y} Z^n(\mu, \mu_a^*, y, W(y)), \tag{7a}$$

where $Y$ is the range of $y^*(n)$, and $Z^n(\cdot)$ is the type $n$ agent’s expected
utility given (i) own effort level $\mu$, (ii) conjectured rival’s effort level
$\mu_a^*$, (iii) wage conditional on choosing the $y$ contract of $W(y)$ provided
$I = 0$, and (iv) if $I = 1$, wage $W(y) - y$. Notice that (7a) requires
incentive compatibility in both contract and effort choice. Furthermore,
for the contract set $Y$, expected payoff must equal expected
product for each ability type; that is, for all $n \in [a, b]$,

$$V\mu_n^* - C(\mu_n^*, n) = Z^n(\mu_n^*, \mu_a^*, y^*(n), W(y^*(n))). \tag{8}$$

In what follows, $Z^n(\mu, \mu_a^*, y, W(y))$ is explicitly written to reexpress
(7a) as 6

$$(y^*(n), \mu^*(n)) \in \operatorname*{argmax}_{\mu \in D,\; y \in Y} \left[W(y) - y\,G(\mu_a^* - \mu) - C(\mu, n)\right]. \tag{7b}$$

5 Lazear and Rosen also consider mixed (pooled) tournaments, in which a given $\{W_1, W_2\}$
wage tuple is used to reward or penalize in contests across randomly chosen ability
types. They show that efficient effort levels are attained only when there is an equal
proportion of each type. Furthermore, mixed contracts are also vulnerable to competition
because firms would offer contracts that lure type $b$ agents away.
y. e d.j e r 

It is helpful to view the problem of satisfying (7b) and (8) in two
parts. Define the associated self-selection problem to be solved if there
exist $\{y^*(n), W(y)\}$ satisfying, for all $\mu_n^* \in A = [\mu_a^*, \mu_b^*]$,

$$y^*(n) \in \operatorname*{argmax}_{y \in Y} \left[W(y) - y\,G(\mu_a^* - \mu_n^*)\right] \tag{9}$$

as well as equation (8), where $Y$ is again the range of $y^*(n)$. Proposition
1 arises from Guasch and Weiss (1980) and Bhattacharya (1980).

Proposition 1. Strictly increasing and differentiable $\{y^*(n), W(y)\}$
satisfying equations (8) and (9) exist if and only if $G(\mu_a^* - \mu)$ is strictly
convex in $\mu$ for $\mu \in A$.

Sketch of proof. First-order conditions in (9) give us, for all n E [a, b], 


dW{y*{n)) 

dy 


G(p? - pj) = 0. 


(10a) 


Totally differentiating equation (8) using the definition of Z”(-) in (7b) 
gives us 


Vdp* _ dW(y*(n)) dy*{n) „. v dy*(n) 

__ _ _ G(p 0 p„) dn 


+ y*( n )gi - rt) 


Combining this with (10a), we see that y*(n) is defined by 

V = y*(n)g(pj - pjf) (10b) 

so that, given g'(e) > 0 for e < 0, we have that y*(n) is strictly increas¬ 
ing. The second-order conditions in (9) are shown to be satisfied by 


6 Equations (7b) and (8) are necessary and sufficient for the first-best to be attainable 
as a Nash equilibrium across competing employers. Necessity arises because contracts 
that cross-subsidize across different types cannot be Nash equilibria with uninformed 
employers “moving first" in the extensive game, whereas sufficiency arises because of 
the lack of deadweight losses given risk-neutral agents (e.g., Guasch and Weiss 1980). 
totally differentiating (10a), given our assumption on g(e); W(y) is
constructed using (8) and (10b). Q.E.D.
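
As an illustration of proposition 1 (a sketch under an assumed noise distribution, not part of the original argument), write $\mu$ for effort and suppose the relevant noise term has a Laplace distribution, so that $g(e) = G(e) = \tfrac{1}{2}e^{e}$ for $e < 0$ and hence $g'(e) > 0$ on that range. Reading (8) and (10b) with the lowest type's effort $\mu_a^*$ as the comparison standard, the contracts then come out in closed form:

```latex
\begin{align*}
y^*(n)    &= \frac{V}{g(\mu_a^* - \mu_n^*)} = 2V e^{\mu_n^* - \mu_a^*}, \\
W(y^*(n)) &= V\mu_n^* + y^*(n)\,G(\mu_a^* - \mu_n^*) = V\mu_n^* + V, \\
\frac{dW}{dy} &= \frac{V\,d\mu_n^*}{2V e^{\mu_n^* - \mu_a^*}\,d\mu_n^*}
              = \tfrac{1}{2} e^{\mu_a^* - \mu_n^*} = G(\mu_a^* - \mu_n^*).
\end{align*}
```

In this assumed case the penalty $y^*(n)$ rises with $\mu_n^*$, and the last line reproduces the first-order condition (10a).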

Given proposition 1, it is now easy to extend the argument to show
that the same $\{y^*(n), W(y)\}$ functions also solve the self-selection cum
moral hazard problem of equations (7b) and (8). The key step is
extending the domain of $\mu$ to $D = [0, \infty)$ rather than $A = [\mu_a^*, \mu_b^*]$. We
first note that, given $\{y^*(n) \in Y, W(y)\}$, proposition 1 implies that
choice of $\mu > \mu_b^*$ can, at most, result in obtaining expected payoff $V\mu$
since $g'(e) > 0$ for $e < 0$.

Proposition 2. $\{y^*(n), W(y)\}$ satisfying equations (8) and (9) also satisfy
equation (7b) and thus solve the moral hazard cum self-selection
problem.

Proof. For any agent of type n, consider first his choice of $\mu \in A =
[\mu_a^*, \infty)$ rather than D in (7b). Clearly, by proposition 1 over this
restricted domain he will choose $\{\mu, y^*(n)\}$ to maximize in (7b) by
choosing

$$\mu_n^* = \operatorname*{argmax}_{\mu \in A}\ [V\mu - C(\mu, n)].$$

Now consider $\mu < \mu_a^*$, first for agents of type a. For them, conditional
on choosing the $\{y^*(a), W(y^*(a))\}$ contract, the problem of choice over
$\mu$ is the same as in the ex ante homogeneous agents' problem. 7 Since
equation (6d) holds, this subproblem is strictly concave in $\mu$ and thus,
for all $\mu < \mu_a^*$,

$$\frac{\partial Z^a(\mu, \mu_a^*, y^*(a), W(y^*(a)))}{\partial \mu} > 0.$$

Now, for $n > a$, $y^*(n) > y^*(a)$ by proposition 1, and
$\partial C(\mu, n)/\partial \mu < \partial C(\mu, a)/\partial \mu$ at any $\mu$. Hence, it follows from the
definition of $Z^n(\mu, \mu_a^*, y, W(y))$ in (7b) that, for any type $n \in [a, b]$
and any feasible contract in Y and $\mu < \mu_a^*$, we must have that

$$\frac{\partial Z^n(\mu, \mu_a^*, y, W(y))}{\partial \mu} > 0.$$

This last observation ensures that the solution to the right-hand side 
of (7b) lies in the subdomain A and thus satisfies (7b). Q.E.D. 

In the additive noise case, if G(e) is not convex over the entire
subdomain $(-\infty, 0)$, it could still be possible to construct a set of
contracts with differing comparison standards as well as wage tuples 

7 It should be clear that a Nash equilibrium in which type a agents choose $y \neq y^*(a)$,
and hence $\mu > \mu_a^*$, is Pareto inferior for all agents since any choice of $\mu > \mu_a$ now
results in maximized pecuniary expected payoff lower than $V\mu$ because of the higher
comparison standard. Note also that weakly convex G(e), e < 0, implies
non-fully-separating $y^*(n)$.
that solve the associated self-selection problem. The essential requirement,
developed in Guasch and Weiss (1982), is that there exists
$e^* \le -(\mu_b^* - \mu_a^*)$ such that g(e) is greater for e in $[e^*, 0]$ than g(e) for e in
$[2e^*, e^*]$. We develop a related theme and its implication for within-
versus across-cohort tournaments further below.

The implication of the preceding analysis is that tournaments 
across cohorts are superior to tournaments within cohorts on effi¬ 
ciency grounds. Nevertheless, most of the empirically observed con¬ 
tests typically each have a restricted set of participants that is far 
smaller than the size of the organizations, and the agents who are 
compared with each other perform similar tasks or work in the same 
or a highly contiguous level of the hierarchical structure of the firm. 
It is this “anomaly” that we now seek to resolve. 

B. Limited Liability Constraints 

Here we suggest an argument that might bias the choice of tourna¬ 
ment design toward those based on comparisons within cohorts. It 
relies on the existence of nonnegativity constraints and can be stated
as follows. Assume a desired “no-quitting” constraint arising from 
interfirm competition ex post. Then, since self-selected ability types 
have to be paid at least their expected marginal product in the future, 
initial wages must be below first-period expected marginal products. 
If performance comparisons across ability types are used to elicit self¬ 
selection of ability types into contracts and effort levels, the degree to 
which first-period wages lie below marginal products may be in¬ 
creased. As a result, the limited liability constraint (nonnegative 
wages) is more likely to be violated than in tournaments in which the 
comparisons are within ability types. In a single-period concurrent 
payment framework, limited liability constraints require a nonnega¬ 
tive failing wage, and an exactly analogous argument applies. Specifi¬ 
cally, when a high-ability type competes against a low-ability type, he 
is very likely to win. As a result, he is given a tiny payment if he loses 
and a large prize if he wins. The large spread is not of concern 
because he is risk neutral, but it deters the low-ability type from join¬ 
ing that contract. The small losing wage may potentially violate a 
limited liability constraint. By comparison, when an agent competes 
with an agent of symmetric ability, the prize is much smaller, and thus 
there is less worry about the limited liability problem. For simplicity, 
we focus on this static case; the analogy between it and the mul¬ 
tiperiod case is made precise in Bhattacharya (1980) and Guasch and 
Weiss (1980, 1982). 

To formalize this argument, we should compare the equilibrium 
losing wages for each type under both regimes of tournaments within 
and across cohorts. We have already characterized the equilibrium for
the tournament based on comparisons across cohorts. However, in 
the continuum of types case and given the additive monitoring noise 
structure in equation (1), an equilibrium set of tournaments based on 
comparisons within cohorts does not exist. 8 Because of this nonexis¬ 
tence problem, noted in Lazear and Rosen (1981), it is not possible to 
establish the relationship of the losing wage for each type under these 
assumptions. However, by slightly altering the framework to accom¬ 
modate nonadditive monitoring noise structures, we shall construct 
an example in which an equilibrium exists under both regimes and 
thus comparisons of failing wages across regimes can indeed be made. 

Given the similarities between tournaments and absolute “test stan¬ 
dards,” and for tractability of the probability distributions used in our 
example, we shall explicitly consider tests rather than tournaments as 
devices to elicit effort and sort the types. We shall characterize the 
equilibrium under two regimes: the first is induced by a single test or 
standard that corresponds to making comparisons across cohorts 
since all agents are compared with the same standard; the second 
equilibrium is induced by a regime that utilizes type-specific tests in a 
way that corresponds to making comparisons within cohorts. 9 We 
assume that for any effort choice $\mu_j$ (unobservable to the firm) the
performance indicator for the jth agent, $d_j(\mu_j)$, is distributed uniformly
in the interval $[0, 2\mu_j]$. All other assumptions are as in Section
IIA above. Contracts are based on $I_j \in \{0, 1\}$, with $I_j = 0$ if and only if
$d_j \ge K$, where K is the test standard.

Single Test Equilibrium 

Let K be the standard chosen and $\{W(y), [W(y) - y]\}$ be the passing
and failing wages associated with the y contract. The probability density
function of the performance indicator d of an agent choosing
effort level $\mu$ is $1/(2\mu)$ for $0 \le d \le 2\mu$. Then an agent choosing contract
y and contributing effort level $\mu$ receives W(y) if $d \ge K$ and W(y) - y
if d < K. The probabilities of those two events are $(2\mu - K)/2\mu$ and
$K/2\mu$, respectively.
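
The pass and fail probabilities follow directly from the uniform assumption. A minimal Python sketch, in which the function names `pass_probability` and `expected_wage` and all numerical values are illustrative assumptions rather than anything taken from the paper:

```python
def pass_probability(mu, K):
    """P(d >= K) when the indicator d is uniform on [0, 2*mu]."""
    if mu <= 0:
        raise ValueError("effort must be positive")
    # Clip so the result stays a probability when K lies outside [0, 2*mu].
    return min(1.0, max(0.0, (2.0 * mu - K) / (2.0 * mu)))


def expected_wage(W, y, mu, K):
    """Expected wage: W with the pass probability, W - y otherwise."""
    p = pass_probability(mu, K)
    return p * W + (1.0 - p) * (W - y)


if __name__ == "__main__":
    # Illustrative numbers only: effort 1.0, standard 1.5, penalty 2.0.
    print(pass_probability(1.0, 1.5))
    print(expected_wage(3.0, 2.0, 1.0, 1.5))
```

Over the nondegenerate range $K/2 < \mu$ the expected wage reduces to $W(y) - yK/(2\mu)$, which is the expression that enters the agent's objective in the single test equilibrium.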

An equilibrium solving the moral hazard cum self-selection prob- 

8 Basically, there is not a set of functions $\{W(y), y(n), \mu(n)\}$ that solve the problem

$$(\mu(n), y(n)) = \operatorname*{argmax}_{\mu,\ y} \left[ W(y) - y\,G[\mu(n) - \mu] - C(\mu, n) \right]$$

and

$$W(y(n)) - y(n)\,G(0) = V\mu(n).$$

9 The need for this simplification arises solely because otherwise we would have to 
make distributional assumptions about (Aj - A t ) rather than about {Aj, J*}, which (in the 
nonadditive noise case) makes for a better primitive assumption. 
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lem in this framework is a set of functions {y*(n), W(y)} with range 
$y^*(n) = Y$ such that, for all $n \in [a, b]$,

$$(y^*(n), \mu_n^*) = \operatorname*{argmax}_{y \in Y,\ \mu \in D} \left[ W(y) - y\,\frac{K}{2\mu} - C(\mu, n) \right] \tag{11a}$$


and 


$$V\mu_n^* = W(y^*(n)) - y^*(n)\,\frac{K}{2\mu_n^*}, \tag{11b}$$

where $V = C'(\mu_n^*)$. The rationale for this to be an equilibrium is the
same as in the tournament case developed fully above. Solving for the
equilibrium in a similar fashion, we obtain the full characterization of
the equilibrium set of contracts:

$$W(y^*(n)) = 2V\mu_n^*, \tag{12a}$$

$$W(y^*(n)) - y^*(n) = 2V\mu_n^*\left(1 - \frac{\mu_n^*}{K}\right). \tag{12b}$$

Given equations (12a) and (12b), W(y) is recovered using the inverse
function of $y^*(n)$. Thus the efficient allocation is attained.

For the solution to (11a) and (11b) to be well defined, the probabilities
of passing the test have to be nondegenerate. That imposes a
constraint on K. Specifically, $K < 2\mu_n^*$ for all n, and since, for $i > j$,
$\mu_i^* > \mu_j^*$, that constraint reduces to $K < 2\mu_a^*$, where a is the lowest-ability
type. However, given equation (12b) and nonnegative wages, it has to
be that $K \ge \mu_n^*$ for all n. Thus an efficient single test or testing across
cohorts is implementable if and only if $2\mu_a^* > \mu_b^*$, where b is the
highest-ability type. Moreover, in this example the resulting equilibrium
wages are nested so that $W^b > W^a > W^a - y^a > W^b - y^b$. Thus the
highest-ability types are the ones for whom limited liability constraints
are more likely to be violated.
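
The single test equilibrium can be checked numerically. The sketch below is a hedged illustration, not the authors' computation: it assumes a quadratic effort cost $C(\mu, n) = \mu^2/(2n)$ (so that the efficient effort is $\mu_n^* = Vn$), three hypothetical ability types, and a grid search restricted to efforts above $K/2$, where the pass probability is nondegenerate. The helper names are invented for the example.

```python
def contract(mu_star, V, K):
    """Passing wage W and penalty y implied by (12a)-(12b)."""
    W = 2.0 * V * mu_star
    y = 2.0 * V * mu_star ** 2 / K  # so that W - y = 2*V*mu*(1 - mu/K)
    return W, y


def payoff(W, y, mu, n, K):
    """Expected wage minus the ASSUMED effort cost C(mu, n) = mu**2 / (2*n)."""
    return W - y * K / (2.0 * mu) - mu ** 2 / (2.0 * n)


def best_response(n, types, V, K, grid=400):
    """Grid search over contracts (one per type) and effort levels mu > K/2."""
    top = 3.0 * V * max(types)
    best = None
    for m in types:
        W, y = contract(V * m, V, K)
        for i in range(1, grid + 1):
            mu = K / 2.0 + i * (top - K / 2.0) / grid
            value = payoff(W, y, mu, n, K)
            if best is None or value > best[0]:
                best = (value, m, mu)
    return best  # (payoff, type whose contract is chosen, chosen effort)


def implementable(types, V, K):
    """Nondegenerate pass probabilities and a nonnegative failing wage."""
    efforts = [V * n for n in types]
    return max(efforts) <= K < 2.0 * min(efforts)


if __name__ == "__main__":
    types, V, K = [1.0, 1.2, 1.4], 1.0, 1.5  # hypothetical values
    print(implementable(types, V, K))
    for n in types:
        print(n, best_response(n, types, V, K))
```

Under these assumptions each type picks the contract designed for it and an effort close to $Vn$, while a type spread such as `[1.0, 2.5]` admits no admissible common standard.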


Type-specific Tests 

Let K(y) be the standard selected for the test contract with penalty y.
The probability of failing test y of an agent supplying effort $\mu$ is
$K(y)/2\mu$. An equilibrium solving the moral hazard cum self-selection
problem here is a set of functions $\{W(y), y^*(n), K(y)\}$ such that, for all
$n \in [a, b]$,

$$\{y^*(n), \mu_n^*\} = \operatorname*{argmax}_{y \in Y,\ \mu \in D} \left[ W(y) - y\,\frac{K(y)}{2\mu} - C(\mu, n) \right] \tag{13a}$$

and

$$W(y^*(n)) - y^*(n)\,\frac{K(y^*(n))}{2\mu_n^*} = V\mu_n^*. \tag{13b}$$
The solution to this problem is given by

$$V = \frac{y^*(n)\,K(y^*(n))}{2\mu_n^{*2}}, \tag{14}$$

with W(y) given by solving (13b) and (14) for a given K(y). Notice that
this regime gives rise to the efficient allocation as well with one degree
of freedom in the choice of K(y). For example, let $y^*(n) = 2V\mu_n^*$; then
$K(y^*(n)) = \mu_n^*$ for all n. In this case, $W(y^*(n)) = 2V\mu_n^*$ and thus W(y)
- y = 0. Here, the probability of passing the test is 1/2 when each type
self-selects. This case is the analogue of the tournament with comparisons
within cohorts (see n. 9). In our example, these type-specific
contests dominate fixed contests since the former are always implementable,
while the latter are not since they violate limited liability
constraints when $2\mu_a^* < \mu_b^*$.
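
The type-specific example can be verified with a few lines of arithmetic. This is a sketch only: the function names are hypothetical, and the formulas are the readings of (13b) and (14) used here, with penalty $y = 2V\mu$.

```python
def type_specific_contract(mu_star, V):
    """Penalty y = 2*V*mu as in the example, K from (14), W from (13b)."""
    y = 2.0 * V * mu_star
    K = 2.0 * V * mu_star ** 2 / y             # (14): V = y*K / (2*mu**2)
    W = V * mu_star + y * K / (2.0 * mu_star)  # (13b): expected wage = V*mu
    return W, y, K


def single_test_feasible(mu_low, mu_high):
    """A common standard K with mu_high <= K < 2*mu_low exists iff this holds."""
    return 2.0 * mu_low > mu_high


if __name__ == "__main__":
    for mu in (0.5, 1.0, 3.0):  # hypothetical efficient effort levels
        W, y, K = type_specific_contract(mu, V=1.5)
        print(mu, W, y, K, 1.0 - K / (2.0 * mu))
    print(single_test_feasible(1.0, 2.5))
```

For every effort level the failing wage $W - y$ is zero and the pass probability is one-half, so limited liability never binds, whereas the single test fails the feasibility check once the highest effort exceeds twice the lowest.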


C. Risk Aversion 

To conclude this section, let us consider how the results might be 
affected if the agents were risk averse. It should be clear that it is not 
easy to extend the clean results in propositions 1 and 2 to a second- 
best scenario with risk-averse agents when moral hazard problems 
alone make it impossible to attain unconstrained Pareto efficiency. 
There are two difficulties involved. First, one can no longer take the 
“target” set of constrained efficient allocations from the homoge¬ 
neous agents case, for, as we have seen, optimal contracts with hetero¬ 
geneity may well involve comparisons across self-selected types. Sec¬ 
ond, it is, in general, difficult to characterize the solution to the 
associated self-selection problem for an arbitrary set of effort levels. 
As Rothschild and Stiglitz (1976) have noted, in a second-best 
scenario with risk-averse agents, the competitive adverse selection 
problem need not have a pure strategy Nash solution when the unin¬ 
formed principals “move first” in the extensive form game. If a solu¬ 
tion exists, then across-cohort comparisons will not necessarily be wel¬ 
fare decreasing (risk increasing) because the resulting higher wage 
differentials in contracts are compensated by the greater (than half) 
probability that more able agents will win. 


III. Tournaments in Hierarchical Organizations: 

Some Issues 

In the Introduction, we suggested a correspondence between the two 
different types of contracts considered above and incentive schemes 
in hierarchies that compare agents within or across different levels of 
the hierarchy. The essence of this correspondence is derived from the 
observation that in most extant models of hierarchies (e.g., Rosen
1982), higher-ability agents are placed in higher levels of the hierar¬ 
chy. The basic reason for this is the “downward externality” that more 
able agents generate for the productivity of agents in lower levels, for 
example, through more efficient effort supervision or information 
processing on tasks. In this section, we note some issues that arise in 
integrating the theory of incentive cum sorting schemes discussed 
above with that of hierarchies with interactive productivity effects 
across levels. A brief example suffices to make the relevant points. 

In many models of hierarchies (e.g., Calvo and Wellisz 1979), the 
function of higher levels is supervision and thus the creation of effort 
incentives for lower levels. While this role is no doubt important, it is 
analytically difficult to integrate this form of incentive creation with 
those created by multiagent contracts. Instead, we assume that agents 
have (marginal) productivity levels that are “inherently” functions of 
(i) their effort level and (ii) their placement in a hierarchy. 10 For 
simplicity, let there be two levels of ability, a and b, and two levels of 
jobs, A and B, a < b and A < B. The marginal product coefficient of
workers in jobs is given by the quadruple $\{n(a, A) \le n(a, B) \le n(b, A)
< n(b, B)\}$, where by marginal product coefficient we mean that workers
have ex post observable performances $n(\cdot, \cdot)\mu + \epsilon$ with effort
disutility functions $C^a(\mu)$ and $C^b(\mu)$, $C^{b\prime}(\mu) < C^{a\prime}(\mu)$, as before. For
example, workers of different abilities may differ by “task capacity,”
so that

$$n(a, A) = n(a, B) = n(b, A) = 1 \tag{15a}$$

as before but

$$n(b, B) = (b/a). \tag{15b}$$

For the specification in equations (15a) and (15b), it is no longer
clear that Lazear and Rosen’s (1981) claim, that cohort-specific tournaments
based on $\{\mu_i + \epsilon_i\}$ comparisons will not be incentive compatible,
is valid. The reason is that a type a worker who
contemplates taking on the contract for the type b will find that his 
relative productivity at a given level of (disutility of) effort is lower in 
a type B job than that of the type b worker. 

10 For example, a higher-level manager may be better able to process information
about demand shocks and, thus, the optimal employment level at the lower level if he is
of higher ability. Given diminishing marginal product of labor and fewer higher- than
lower-level workers, the higher-ability managers' contribution to the firm is comparatively
greater when they are placed at a higher level.

Formally, the contract structures in the two cohort-specific tournaments
will now have winning versus losing wage differentials of
$W_1^a - W_2^a = V/g(0)$ and $W_1^b - W_2^b = (b/a) \cdot [V/g(0)]$, respectively, and in
contemplating switching to the type b contract, the type a agent will
not do so if for his associated optimal $\mu$

(16a) 

Equation (16a) is derived using the analogues of equations (6b) and 
(6c). The inequality in (16a) may hold even though g(e) is (symmetric 
and) strictly unimodal at g(0) since that only implies 

Hi -•*■*) > ^ [ G (i «*» - <*) - - 4 <'6b) 

However, given (15a), the inequality in (16a) will not imply that the 
reverse violation of incentive compatibility, with type b agents prefer¬ 
ring the type a contract, will occur either (as it does not in Lazear and 
Rosen [1981]). 

Now, a job-specific productivity feature, as in (15a) and (15b), 
which was analyzed in Bhattacharya (1980), will in general also make 
it more likely that across-cohort tournaments will produce efficient 
allocations. For example, in contrast to our result in proposition 1, we
need not require g'(e) > 0 for e < 0 to solve the associated self¬ 
selection problem. However, as before, across-cohort comparison 
contracts will result in higher winning-losing wage differentials than 
within-cohort contracts, and nonnegativity of wages is more likely to 
bind as a constraint. 

Hierarchies also introduce an additional question. If the performance
measurement errors $\{\epsilon_i\}$ are correlated across agents, are these
correlations, which are beneficial for the principal in creating effort
incentives, likely to be greater in within- versus across-cohort comparisons?
The answer to this question depends very much on whether
across-level demand or productivity shocks, versus monitoring
errors that may be more correlated within a cohort (hierarchical
level), dominate in determining $\{\epsilon_i\}$. 11 The nature of the organization
of the hierarchy along “functional” versus “product” lines will also 
affect the detailed answer. These observations largely serve to point 
to the inadequacy of existent models of tournaments in dealing with 

11 For instance, the accounting rate of return comparisons across divisional profit 
centers may be erroneous by a common factor when inflation biases historical cost or 
inventory data at all divisions similarly. In contrast, productivity shocks may differ 
across divisions or products but be highly correlated across hierarchical levels within a 
division. 
the richness and complexity of hierarchical organizations and to the
importance of the remaining tasks in integrating the theories of in¬ 
centive schemes and hierarchies. 


References 

Bhattacharya, Sudipto. “Nondissipative Signaling Structures and Dividend
Policy.” Q.J.E. 95 (August 1980): 1-24.

-----. “Aspects of Monetary and Banking Theory and Moral Hazard.”
J. Finance 37 (May 1982): 371-84.

-----. “Tournaments and Incentives: Heterogeneity and Essentiality.”
Working Paper no. 695. Stanford, Calif.: Stanford Univ., Grad. School
Bus., March 1983.

Calvo, Guillermo A., and Wellisz, Stanislaw. “Hierarchy, Ability, and Income
Distribution.” J.P.E. 87, no. 5, pt. 1 (October 1979): 991-1010.

Carmichael, Lorne. “Firm-specific Human Capital and Promotion Ladders.”
Bell J. Econ. 14 (Spring 1983): 251-58.

Green, Jerry R., and Stokey, Nancy L. “A Comparison of Tournaments and
Contracts.” J.P.E. 91 (June 1983): 349-64.

Guasch, J. Luis, and Weiss, Andrew. “Wages as Sorting Mechanisms in Competitive
Markets with Asymmetric Information: A Theory of Testing.” Rev.
Econ. Studies 47 (July 1980): 653-64.

-----. “An Equilibrium Analysis of Wage-Productivity Gaps.” Rev. Econ.
Studies 49 (October 1982): 485-97.

Holmström, Bengt. “Moral Hazard in Teams.” Bell J. Econ. 13 (Autumn
1982): 324-40.

Lazear, Edward P., and Rosen, Sherwin. “Rank-Order Tournaments as Optimum
Labor Contracts.” J.P.E. 89 (October 1981): 841-64.

Malcomson, James M. “Work Incentives, Hierarchy, and Internal Labor Markets.”
J.P.E. 92 (June 1984): 486-507.

Nalebuff, Barry, and Stiglitz, Joseph E. “Prizes and Incentives: Toward a
General Theory of Compensation and Competition.” Bell J. Econ. 14
(Spring 1983): 131-47.

Rosen, Sherwin. “Authority, Control, and the Distribution of Earnings.” Bell
J. Econ. 13 (Autumn 1982): 311-23.

Rothschild, Michael, and Stiglitz, Joseph E. “Equilibrium in Competitive Insurance
Markets: An Essay on the Economics of Imperfect Information.”
Q.J.E. 90 (November 1976): 630-49.



Confirmations and Contradictions 


Arbitrage during the Dollar-Sterling 
An Econometric Approach
Stanford University


I. Introduction 

The gold standard as instituted in the late nineteenth and early twentieth
centuries has recently become the center of public policy and
academic scrutiny, and a controversy over its efficiency has developed
(see Clark 1984; Officer 1986). In short,
Clark, following Morgenstern (1959), finds that the gold standard was
inefficient because it exhibited many “violations of the gold points.” 1
Officer, however, attempts to show that the gold standard was efficient
in that few gold point violations are observed when proper
methods are used to calculate the relevant gold points. 2

We would like to thank Truman Clark for generously providing his data on gold 
shipments, and Truman Clark, Lawrence Officer, two anonymous referees, and two 
editors of this Journal for helpful suggestions. The usual disclaimer applies. 

1 Clark also considers as tests of efficiency how rapidly gold point violations disappeared
and whether gold always moved in the profitable direction whenever a gold
point violation occurred. He finds that violations often persisted for extended periods,
and gold sometimes flowed in unprofitable directions, thus concluding that the system
was inefficient.

2 Officer maintains that an arbitrageur would finance his gold shipments in the 
country with the lower interest rate, not in his country of residence, as Clark assumes. 
Also, Officer meticulously computes direct shipping costs from their detailed compo¬ 
nents for the entire period. Clark, on the other hand, assumes that direct costs were 
constant over time and were the same in each direction, both of which Officer’s calcula¬ 
tions refute. 
In this paper we estimate the system’s arbitrage costs econometrically
by using a framework developed in Spiller and Wood (1988). This
framework uses only exchange rate information to estimate arbitrage
costs and arbitrage probabilities simultaneously. 3

This methodology is based on four main assumptions: first, that 
economic agents are profit maximizers; second, that there are positive 
arbitrage costs; third, that arbitrage costs are stochastic; and finally, 
that arbitrage costs have nontrivial, unobservable (to the econometri¬ 
cian) components. Under those assumptions, gold point violations are 
nothing but the result of random (positive) shocks to arbitrage costs 
that make arbitrage more expensive than average. Since arbitrage 
costs are stochastic, gold points are stochastic as well. 

In this framework, then, two distinct cases must be analyzed. Con¬ 
sider first the level of the free-floating exchange rate. If there is no 
opportunity for gold arbitrage (i.e., the free-floating exchange rate 
falls in between the stochastic gold points), then the observed ex¬ 
change rate would actually be the free-floating exchange rate. If, on 
the other hand, the free-floating exchange rate falls outside the 
boundary of the relevant gold point (as determined by technical and 
informational transaction costs), then arbitrageurs would (immedi¬ 
ately) act in such a way to exhaust all profitable opportunities. In this 
case, we would not observe the free-floating exchange rate. Rather we 
would observe the exchange rate to be equal to the relevant gold 
point. No violation of the (relevant) gold point occurs. Gold points, 
however, are not constant, but change over time as the underlying 
transaction (or arbitrage) costs change. 

Our model consists, then, of three different stochastic processes 
occurring at the same time: one for the U.S. import gold point, a 
second for the U.S. export gold point, and a third for the free-floating 
exchange rate. We observe exactly one of these but do not know 
which, although we have data that may be used to estimate all three. 

The model just described is a two-limit stochastic Tobit model that 
estimates simultaneously transaction costs and the probability of arbi¬ 
trage in each direction. 
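
The censoring mechanism behind this two-limit model can be made concrete with a small simulation. The sketch below is illustrative only: it works with the log deviation of the exchange rate from mint parity, treats the two gold points as percentage transaction costs drawn from truncated normals, and uses made-up parameter values and function names.

```python
import random


def draw_cost(rng, mode, sigma, lower):
    """Rejection-sample a normal(mode, sigma) truncated from below at `lower`."""
    while True:
        t = rng.gauss(mode, sigma)
        if t >= lower:
            return t


def simulate(n_obs, u, sigma0, v1, v3, seed=0):
    """Return (observed log deviation, regime) pairs.

    v1 and v3 are (mode, sigma, lower truncation) triples for the percentage
    costs of shipping gold into Great Britain and into the United States.
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n_obs):
        z_free = u + rng.gauss(0.0, sigma0)  # free-floating log deviation
        upper = draw_cost(rng, *v1)          # stochastic gold point, regime 1
        lower = -draw_cost(rng, *v3)         # stochastic gold point, regime 3
        if z_free > upper:
            out.append((upper, 1))           # arbitrage pins the rate at the limit
        elif z_free < lower:
            out.append((lower, 3))
        else:
            out.append((z_free, 2))          # no arbitrage: free rate observed
    return out


if __name__ == "__main__":
    obs = simulate(520, 0.0006, 0.003, (0.0025, 0.0008, 0.0005),
                   (0.004, 0.001, 0.0005))
    for regime in (1, 2, 3):
        print(regime, sum(1 for _, r in obs if r == regime) / len(obs))
```

In the simulated data the econometrician would see only the first element of each pair; the regime label is exactly the information the likelihood has to integrate out.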

II. The Model 

The variables of our model are the following: X = the official dollar 
price of one ounce of “fine” gold as paid by or charged by the U.S. 

3 An advantage of the proposed methodology over previous ones is the minimal data
requirement. While previous works have required information on the actual components
of transaction costs, our methodology requires only exchange rate information.
To the degree that reliable data on components of transaction costs are available, their
use would improve our estimation.
Treasury (\$20.646); Y = the official pound sterling price of one
ounce of fine gold as paid by or charged by the Bank of England
(£4.252); 4 $S_t$ = cable currency spot exchange rate in dollars per
pound sterling at time t; 5 $T_{1t}$ = transaction costs per ounce of gold
shipped into Great Britain from the United States, in pounds sterling;
and $T_{3t}$ = transaction costs per ounce of gold shipped into the United
States from Great Britain, in dollars. 6

Following the traditional framework for arbitrage as discussed in
Clark (1984), we find that it is profitable to import gold into Great
Britain if and only if

$$S_t > \frac{X}{Y - T_{1t}}. \tag{1}$$

Similarly, it is profitable to import gold into the United States if and
only if

$$S_t < \frac{X - T_{3t}}{Y}. \tag{2}$$

These are the two traditional gold points, which give the boundaries
for the spot exchange rate to fall within. In our arbitrage model, if the
free-floating exchange rate, call it $S^*$, falls outside this gold point
range, then the observed exchange rate, $S_t$, will be equal to the boundary
gold point.
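
A short sketch of the two gold points and of the resulting observed rate may help fix ideas. It reads the two profitability conditions as $S > X/(Y - T_1)$ for shipping gold into Great Britain and $S < (X - T_3)/Y$ for shipping gold into the United States; the function names and the round numbers are hypothetical and are not the paper's data.

```python
def gold_points(X, Y, T1, T3):
    """Return (lower, upper) bounds on the dollar price of sterling."""
    upper = X / (Y - T1)  # above this, shipping gold into Great Britain pays
    lower = (X - T3) / Y  # below this, shipping gold into the United States pays
    return lower, upper


def observed_rate(S_free, X, Y, T1, T3):
    """Observed rate: the free-floating rate censored at the gold points."""
    lower, upper = gold_points(X, Y, T1, T3)
    return min(max(S_free, lower), upper)


if __name__ == "__main__":
    # Hypothetical round numbers: parity X/Y = 5 dollars per pound.
    X, Y, T1, T3 = 20.0, 4.0, 0.01, 0.05
    print(gold_points(X, Y, T1, T3))
    for S_free in (4.90, 5.00, 5.10):
        print(S_free, observed_rate(S_free, X, Y, T1, T3))
```

With fixed costs the band is fixed; in the model described here the costs, and therefore the band, are redrawn every period.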

Let regime 1 be arbitrage from the United States to Great Britain 
(eq. [1]), let regime 2 be the free-floating (or no-arbitrage) regime, 
and let regime 3 be arbitrage from Great Britain to the United States 
(eq. [2]). It will be easier to work in natural logarithms, so taking logs 
of both sides of equations (1) and (2) yields 

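$$\text{regime 1:}\quad \log\left(\frac{S_t Y}{X}\right) > \frac{T_{1t}}{Y} \tag{3}$$

(arbitrage United States to Great Britain),

$$\text{regime 3:}\quad \log\left(\frac{S_t Y}{X}\right) < -\frac{T_{3t}}{X} \tag{4}$$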

4 We have assumed that the two official prices were in fact used uniformly throughout
the period. See, however, Officer (1986) for a discussion of this matter.

5 Officer (1986) argues that demand bill exchange rates were historically used more
often than cable exchange rates. We follow the lead of Morgenstern (1959) and Clark
(1984) but acknowledge the data controversy.

6 Our definition of transaction costs includes all direct (e.g., transportation, interest, 
and insurance) and indirect (e.g., information) costs. Since we assume that they are 
incurred at the outset of the arbitrage process (which takes up to 2 weeks to complete, 
including the transatlantic voyage), our transaction costs should be interpreted as ex¬ 
pressed in present value terms. 
(arbitrage Great Britain to United States),

$$\text{regime 2:}\quad \log\left(\frac{S_t Y}{X}\right) = u + v_{0t} \tag{5}$$

(no arbitrage), where we have used the approximation that log(1 - q)
= -q for q small. The free-floating exchange rate process is given by
(5), where $v_{0t}$ is normally distributed with zero mean and variance $\sigma_0^2$
and u is a constant. 7
Let transaction costs be given by

$$T_{1t} = T_1^* + \epsilon_{1t}, \tag{6}$$

$$T_{3t} = T_3^* + \epsilon_{3t}, \tag{7}$$

where $\epsilon_{it}$ is a random variable derived from a truncated normal with
mean zero, variance $\sigma_i^2$, and truncation point $K_i - T_i^*$, with $K_i$ positive.
The model’s variables are interpreted as follows: $T_{it}$ is the realized
transaction cost, which must always be positive; $T_i^*$ is the mode of the
distribution of transaction costs; and $K_i$ is the lower truncation point
of the distribution of transaction costs. 8
The likelihood function of this model, in general form, 9 is

$$L = \prod_{t=1}^{N} (\lambda_1 f_{1t} + \lambda_2 f_{2t} + \lambda_3 f_{3t}), \tag{8}$$

where $\lambda_i$ is the probability that the observation is from regime i, and $f_{it}$
is the value of the density function at observation t, given that it comes
from regime i.

Using the normality assumptions described above, we can specify 
closed-form expressions for the regime probabilities and for their 
corresponding density functions. Following Weinstein (1964), we 
have 

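$$\lambda_1 = \operatorname{prob}(S_t \in \text{regime 1}) = \operatorname{prob}\left(v_{0t} - \frac{\epsilon_{1t}}{Y} > \frac{T_1^*}{Y} - u\right)$$

$$= 1 - \Phi\left[\frac{(K_1/Y) - u}{\sigma_0}\right] - \int_{[(K_1/Y) - u]/\sigma_0}^{\infty} \frac{1 - \Phi\{(Y/\sigma_1)[\sigma_0 w - (T_1^*/Y) + u]\}}{1 - \Phi[(K_1 - T_1^*)/\sigma_1]}\,\phi(w)\,dw, \tag{9}$$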


7 In principle, u could be a function of explanatory variables. For simplicity, u is
assumed herein to be a constant.

8 We assume that $v_{0t}$, $\epsilon_{1t}$, and $\epsilon_{3t}$ are mutually independent and identically distributed.
Serial and cross-equation correlations would make the econometrics intractable.

9 We will also estimate the model using a gamma and a lognormal specification for
the distribution of the transaction costs. In the exposition of the model we will discuss
only the truncated normal case and refer to the other models’ results where applicable.
See the Appendix for details.
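$$\lambda_3 = \operatorname{prob}(S_t \in \text{regime 3}) = \operatorname{prob}\left(v_{0t} + \frac{\epsilon_{3t}}{X} < -u - \frac{T_3^*}{X}\right)$$

$$= \Phi\left[\frac{-u - (K_3/X)}{\sigma_0}\right] - \int_{-\infty}^{[-u - (K_3/X)]/\sigma_0} \frac{1 - \Phi\{(X/\sigma_3)[-(T_3^*/X) - u - \sigma_0 w]\}}{1 - \Phi[(K_3 - T_3^*)/\sigma_3]}\,\phi(w)\,dw, \tag{10}$$

where $\phi$ and $\Phi$ are the density and distribution functions of the standard
normal, respectively. Since the three regimes are mutually exclusive
and exhaustive,

$$\lambda_2 = 1 - \lambda_1 - \lambda_3. \tag{11}$$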


The values of the three densities for an observed $S_t$ are easily
derived from the distributional assumptions:

$$f_{1t} = \frac{Y\,\phi\{(Y/\sigma_1)[\log(S_t Y/X) - (T_1^*/Y)]\}}{\sigma_1\{1 - \Phi[(K_1 - T_1^*)/\sigma_1]\}}, \tag{12}$$

$$f_{2t} = \frac{\phi\{[\log(S_t Y/X) - u]/\sigma_0\}}{\sigma_0}, \tag{13}$$

$$f_{3t} = \frac{X\,\phi\{(X/\sigma_3)[-(T_3^*/X) - \log(S_t Y/X)]\}}{\sigma_3\{1 - \Phi[(K_3 - T_3^*)/\sigma_3]\}}. \tag{14}$$

The complete set of eight parameters that need to be estimated is
$\{u, \sigma_0, T_1^*, \sigma_1, K_1, T_3^*, \sigma_3, K_3\}$. In the next section we estimate this
model, test whether it predicts arbitrage in times of gold shipments 
(and in the proper direction) as well as the instances of gold point 
violations found by Clark, and test it against a no-arbitrage alternative 
model. We conclude with a comparison of our estimates of the cost 
variables with the list of computed costs of Clark and Officer. 
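
To show how the pieces of the likelihood fit together, here is a minimal pure-Python sketch. It is not the authors' estimation code: the regime probabilities are obtained by simple midpoint integration rather than from a closed form, the support restrictions on the two arbitrage densities are added explicitly, and the parameter values in the usage section are invented for illustration.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def trunc_cdf(t, mode, sigma, K):
    """CDF at t of a normal(mode, sigma) truncated from below at K."""
    if t <= K:
        return 0.0
    base = Phi((K - mode) / sigma)
    return (Phi((t - mode) / sigma) - base) / (1.0 - base)


def regime_probs(p, X, Y, steps=4000, width=8.0):
    """(lambda_1, lambda_2, lambda_3) by midpoint integration over w ~ N(0, 1)."""
    lam1 = lam3 = 0.0
    h = 2.0 * width / steps
    for i in range(steps):
        w = -width + (i + 0.5) * h
        z = p["u"] + p["sigma0"] * w  # free-floating value of log(S*Y/X)
        lam1 += trunc_cdf(Y * z, p["T1"], p["sigma1"], p["K1"]) * phi(w) * h
        lam3 += trunc_cdf(-X * z, p["T3"], p["sigma3"], p["K3"]) * phi(w) * h
    return lam1, 1.0 - lam1 - lam3, lam3


def densities(z, p, X, Y):
    """(f_1, f_2, f_3) evaluated at z = log(S_t*Y/X)."""
    tail1 = 1.0 - Phi((p["K1"] - p["T1"]) / p["sigma1"])
    tail3 = 1.0 - Phi((p["K3"] - p["T3"]) / p["sigma3"])
    f1 = Y * phi((Y * z - p["T1"]) / p["sigma1"]) / (p["sigma1"] * tail1)
    f2 = phi((z - p["u"]) / p["sigma0"]) / p["sigma0"]
    f3 = X * phi((-X * z - p["T3"]) / p["sigma3"]) / (p["sigma3"] * tail3)
    # Support restrictions implied by the truncation points (added here).
    if Y * z < p["K1"]:
        f1 = 0.0
    if -X * z < p["K3"]:
        f3 = 0.0
    return f1, f2, f3


def log_likelihood(zs, p, X, Y):
    """Log of the mixture likelihood over observations zs."""
    lam = regime_probs(p, X, Y)
    return sum(
        math.log(sum(l * f for l, f in zip(lam, densities(z, p, X, Y))))
        for z in zs
    )


if __name__ == "__main__":
    # Illustrative parameter values only; they are not estimates from the paper.
    p = {"u": 0.0006, "sigma0": 0.002, "T1": 0.011, "sigma1": 0.003,
         "K1": 0.002, "T3": 0.065, "sigma3": 0.015, "K3": 0.010}
    print(regime_probs(p, X=20.646, Y=4.252))
    print(log_likelihood([0.001, 0.003, -0.004], p, X=20.646, Y=4.252))
```

A numerical optimizer would maximize `log_likelihood` over the eight parameters; the sketch stops at evaluation because the choice of optimizer is outside what the text specifies.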


III. Data and Estimation Results 

The exchange rate $S_t$ that we use in the estimations is the weekly cable
currency spot rate for the period 1899-1908. 10 Since high and low
weekly spreads are available, to capture all arbitrage opportunities we
use the high rate if it is above the mint parity ratio and the low rate if
it is below mint parity. We maximize the likelihood function given in
equation (8) using the computer program GQOPT. Table 1 provides
the summary descriptive statistics of the data. It also provides the log 

10 The data come from Andrew (1910) and U.S. National Monetary Commission
(1910). Only monthly data are available from 1890 to 1898. Since we have 520 weekly
observations, we estimate the model using only weekly data.
TABLE 1

Summary Statistics

                                                 Net Gold Shipments
                                                 from United States
                          S_t          Z_t       to Great Britain

Mean                    4.86965       .00063          -343.954
Standard deviation       .012         .002           1,812.972
Quartiles:
  Minimum               4.83750      -.00599       -17,996.0
  First quartile        4.86125      -.00109            -8.0
  Median                4.87125       .00096              .0
  Third quartile        4.87750       .00225             2.0
  Maximum               4.90125       .00710         8,330.0

Source: Andrew (1910); U.S. National Monetary Commission (1910).

Note: S_t is the cable currency exchange rate in dollars per pound sterling; Z_t = log[S_t/(X/Y)] is the left-hand side
of eqq. (3)-(5) and determines to which regime the observation belongs, i.e., in which direction arbitrage would
conceivably be profitable. Gold shipments are weekly bilateral net exports from the United States to Great Britain for
1900-1908 in thousands of dollars. For comparison, recall that the mint parity ratio X/Y is $4.86656 per pound
sterling. There are 50 weeks in the sample with less than $100 of net gold shipped, so the value of 0.0 is taken


of the ratio of the exchange rate to the mint parity ratio since it is this 
unitless quantity (it is compared with transaction costs expressed as a 
percentage of the price of gold) that appears on the left-hand side of 
equations (3)-(5), determining to which regime the observation most 
likely belongs. 
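
The construction of the regime variable can be sketched in a few lines. This is only an illustration, not the authors' code: the function name `regime_variable`, the rule applied when a week's high and low straddle parity, and the sample quotes are all assumptions, and the mint parity of 4.86656 dollars per pound is passed in explicitly rather than taken from the estimation.

```python
import math

def regime_variable(high, low, parity):
    """Return Z = log(S / parity) for one week of quotes.

    Assumed rule: use the weekly high if it lies above mint parity;
    otherwise use the weekly low, which is then at or below parity.
    """
    s = high if high > parity else low
    return math.log(s / parity)

# Hypothetical weekly quotes, in dollars per pound sterling.
PARITY = 4.86656
z_above = regime_variable(high=4.8775, low=4.8700, parity=PARITY)
z_below = regime_variable(high=4.8650, low=4.8600, parity=PARITY)

# A positive Z means sterling is dear relative to parity; a negative Z
# means it is cheap. The sign indicates the direction in which gold
# arbitrage could conceivably be profitable.
print(f"{z_above:.5f} {z_below:.5f}")
```

Because Z is a log ratio it is unitless, which is what allows it to be compared directly with transaction costs expressed as a fraction of the price of gold.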

Table 2 presents the estimates of the distribution of transaction 
costs for the truncated normal, gamma, and lognormal distributional 
assumptions. Since we are using truncated distributions, we have 
presented four variables of interest for each distribution: mean, standard 
deviation, mode, and minimum. The underlying parameter estimates 
and asymptotic standard errors are presented in table 3. Table 2 also 
reports the parameter estimates for \(v_1 = T_1/Y\) and \(v_3 = T_3/X\), which 
are the unitless quantities needed to compare across currencies (they 
are interpreted as the percentage transaction cost) from equations 
(3)-(5). Ex ante arbitrage probabilities are also given in this table. 
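
The link between the underlying parameters and the reported means and modes can be checked with a short calculation. The sketch below is an assumption-laden illustration rather than the authors' procedure: it assumes the gamma model is parameterized by shape and scale, and that the lognormal model has log T distributed normal with mean log(alpha) and standard deviation beta. The inputs are the rounded regime-1 estimates reported in table 3, so the results match table 2 only up to rounding.

```python
import math

def gamma_mean_mode(shape, scale):
    """Mean and mode of a gamma distribution (shape-scale form, assumed)."""
    mean = shape * scale
    mode = (shape - 1.0) * scale if shape >= 1.0 else 0.0
    return mean, mode

def lognormal_mean_mode(median, sigma):
    """Mean and mode when log T ~ normal(log(median), sigma**2) (assumed form)."""
    mean = median * math.exp(sigma ** 2 / 2.0)
    mode = median * math.exp(-sigma ** 2)
    return mean, mode

# Rounded regime-1 estimates from table 3 (pounds sterling per ounce).
gamma_T1 = gamma_mean_mode(shape=3.2901, scale=0.0032)
lognormal_T1 = lognormal_mean_mode(median=0.0097, sigma=0.4575)

print("gamma     E(T_1), Mode(T_1):", gamma_T1)
print("lognormal E(T_1), Mode(T_1):", lognormal_T1)
```

Under these assumed parameterizations the computed values land close to the E(T_1) and Mode(T_1) entries of table 2, which suggests the two tables are mutually consistent.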

The results from the three distributional models are quite consis¬ 
tent. The only significant difference is that the truncation points in 
the truncated normal model appear to be different from zero, which 
is imposed a priori in both the gamma and lognormal models. The ex 
ante arbitrage probabilities range from 17 to 23 percent into Great 
Britain and from 5 to 11 percent into the United States. The ex ante 
no-arbitrage probability ranges from 70 to 75 percent. 

The means of the transaction costs are also quite consistent: \(E(T_1)\) 
ranges from £0.010 to £0.012 per ounce of gold shipped into Great 
Britain, and \(E(T_3)\) ranges from $0.058 to $0.078 per ounce of gold 
shipped in the opposite direction. When these are converted to 
percentages of official mint prices in the two destination countries, \(E(v_1)\) 
ranges from 0.23 to 0.25 percent for Great Britain and \(E(v_3)\) ranges 





JOURNAL OF POLITICAL ECONOMY 


TABLE 2 

Transaction Cost Distributions 

               Truncated
                Normal       Gamma      Lognormal
E(T_1)           .0124       .0103        .0108
σ(T_1)           .0050       .0056        .0052
Mode(T_1)        .0063       .0071        .0079
Min(T_1)         .0063       .0000        .0000
E(T_3)           .0578       .0698        .0783
σ(T_3)           .0538       .0319        .0271
Mode(T_3)        .0350       .0555        .0661
Min(T_3)         .0014       .0000        .0000
E(v_1)           .0029       .0024        .0025
σ(v_1)           .0012       .0013        .0012
Mode(v_1)        .0015       .0017        .0019
Min(v_1)         .0015       .0000        .0000
E(v_3)           .0028       .0034        .0038
σ(v_3)           .0026       .0015        .0013
Mode(v_3)        .0017       .0027        .0032
Min(v_3)         .0001       .0000        .0000
λ_1              .168        .227         .204
                (.030)      (.024)       (.033)
λ_2              .728        .699         .751
                (.042)      (.032)       (.039)
λ_3              .104        .074         .045
                (.020)      (.016)       (.017)

Note: T_1 is in pounds sterling per ounce of gold shipped; T_3 is in dollars per ounce of gold shipped; v_1 = T_1/Y, 
v_3 = T_3/X. v_i is interpreted as the percentage transaction cost. λ_1 is the ex ante probability of arbitrage from the 
United States to Great Britain (associated with regime 1 and T_1); λ_2 is the ex ante autarky probability associated with 
regime 2; λ_3 is the ex ante probability of arbitrage from Great Britain to the United States (associated with regime 3 
and T_3). Asymptotic standard errors of λ_i are given in parentheses. 


from 0.28 to 0.38 percent for the United States. Clark uses 0.3125 
percent as the estimate of shipping costs in both directions, which, 
when added to the forgone interest income, is approximately 0.42 
percent. Officer reports an average value of 0.61 percent, which 
includes an additional risk premium. While our estimated means are 
closer to Clark’s, they are not so different from Officer’s once the 
estimated standard deviations of arbitrage costs are taken into account. 

Our results show that arbitrage costs in both directions are very 
volatile. Their estimated standard deviations are approximately 50 
percent of their estimated means. Thus there are instances in which 
the relevant transaction costs may have exceeded 0.5 percent of price. 
Many instances of alleged gold point violations identified by previous 
authors, then, may have been nothing more than instances in which 
arbitrage costs may have been larger than average. The volatility of 
the free-floating regime can be compared with the distribution of 
transaction costs. In principle, the larger the volatility of the free- 
floating regime and the lower the transaction costs, the higher the 
probability of triggering arbitrage. Our finding of relatively low 
arbitrage probabilities is consistent with the results of table 3, where the 
standard deviation of the free-floating exchange rate is estimated to 
be less than the mean of transaction costs (compare \(\sigma_0\) in table 3 with 
\(E(v_i)\) in table 2). Thus arbitrage seems to have been triggered only 
when a large shock to local demand and supply was combined with a 
downward transaction cost shock. Finally, average arbitrage costs 
from Great Britain to the United States seem to exceed those of the 
opposite direction. 11 

TABLE 3 

Maximum Likelihood Estimates 

            Truncated
             Normal       Gamma      Lognormal     Autarky
μ             .0006       .0005        .0004        .0006
             (.0001)     (.0001)      (.0001)      (.0001)
σ_0           .0021       .0022        .0022        .0024
             (.0001)     (.0001)      (.0001)      (.0001)
τ_1           .0005        ...          ...          ...
             (.0265)
σ_1           .0099        ...          ...          ...
             (.0082)
κ_1           .0063        ...          ...          ...
             (.0001)
τ_3           .0350        ...          ...          ...
             (.0148)
σ_3           .0521        ...          ...          ...
             (.0164)
κ_3           .0014        ...          ...          ...
             (.0095)
α_1            ...       3.2901        .0097         ...
                         (.0179)      (.0012)
β_1            ...        .0032        .4575         ...
                         (.0002)      (.0880)
α_3            ...       4.8209        .0741         ...
                         (.0241)      (.0107)
β_3            ...        .0145        .3359         ...
                         (.0018)      (.0960)
Log L       2,401.86    2,402.71     2,401.03     2,393.37

Note: Point estimates are given with associated standard errors in parentheses. Number of observations is 520. 
See the text for the truncated normal structural model and the Appendix for the gamma and lognormal models. 


IV. Further Findings and a Test of the Model 

Our methodology also provides the posterior arbitrage probabilities 
for each observation. Kiefer (1980) shows that an observation’s 
posterior probability of being in each regime is given by 

\[
q_{in} = \frac{\lambda_i f_{in}}{\sum_{j=1}^{3} \lambda_j f_{jn}}, \qquad i = 1, 2, 3, \quad n = 1, 2, \ldots, N, \tag{15}
\]

11 This result, however, differs from Officer's computations. 

TABLE 4 

Dependent Variable: Net Gold Shipments 

                      1              2
ARBIPROB           1,456.95       1,513.15
                   (177.69)       (197.09)
PANICDUM          -3,686.34          ...
                   (352.23)
PANARB           -14,820.94          ...
                 (1,138.48)
Constant            -246.89        -355.35
                    (60.42)        (66.05)
Multiple R^2          .716           .631

Note: ARBIPROB = q_1 - q_3 = the posterior arbitrage probability for the truncated 
normal model; PANICDUM = 1 if during panic of 1907, 0 otherwise; PANARB = 
PANICDUM × ARBIPROB. Dependent variable measures weekly net exports of the 
United States to Great Britain from 1900 to 1908 in thousands of dollars. Number of 
observations is 472. Standard errors are in parentheses. 


where \(q_{in}\) is the posterior probability that observation \(n\) belongs to 
regime \(i\); \(\lambda_i\) is the ex ante probability of regime \(i\), given in equations 
(9)-(11); and \(f_{in}\) is the value of the density of regime \(i\) for observation \(n\), 
given in equations (12)-(14). 
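
The posterior calculation can be illustrated with a small sketch. It assumes the usual Bayes-rule form for a three-regime mixture, in which each ex ante probability is weighted by the regime density at the observation and the weights are normalized to sum to one. The function name `posterior_probabilities` and the density values are hypothetical; only the ex ante probabilities are taken from the truncated normal column of table 2.

```python
def posterior_probabilities(priors, densities):
    """Posterior regime probabilities for one observation.

    priors    -- ex ante probabilities (lambda_1, lambda_2, lambda_3)
    densities -- regime densities evaluated at the observation
    """
    weighted = [lam * f for lam, f in zip(priors, densities)]
    total = sum(weighted)
    if total <= 0.0:
        raise ValueError("at least one regime must have positive density")
    return [w / total for w in weighted]

priors = [0.168, 0.728, 0.104]   # truncated normal model, table 2
densities = [40.0, 5.0, 0.5]     # hypothetical density values for one week

q = posterior_probabilities(priors, densities)
arbiprob = q[0] - q[2]           # directional measure q_1 - q_3
print([round(p, 4) for p in q], round(arbiprob, 4))
```

When the three densities are equal the observation is uninformative and the posterior probabilities collapse to the ex ante probabilities.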

A heuristic test of our methodology consists in comparing the 
posterior arbitrage probabilities with actual gold shipments. The gold 
data, summarized in table 1, are shipments only between the two 
countries. 12 We regress net gold shipments on our calculated 
posterior arbitrage probabilities. 13 We would expect that when the 
probability of arbitrage from Great Britain to the United States is high, then 
gold will flow in that direction in large amounts, and vice versa. We 
use time of departure of the shipment to avoid the complication of 
shipping travel time. Since our model is one of immediate arbitrage, 
we regress contemporary variables. Regression results are presented 
in table 4. Two different regressions are reported. Column 1 presents 
the results of regressing net gold shipments on the posterior arbitrage 
probability, a dummy variable for the panic of 1907, and its product 
with arbitrage probability. 14 Column 2 includes only a constant and 
the posterior probability. We find that the regression coefficient on 
the arbitrage probability \((q_1 - q_3)\) is quite significant and positive in 
both cases. Thus gold shipments and arbitrage probabilities seem to 
be highly correlated. 

Significance tests for the arbitrage probabilities (\(\lambda_1\) and \(\lambda_3\)) are 
difficult to develop. We present in table 2 asymptotic standard errors 
of the probabilities, using the standard asymptotic theory. However, 
the asymptotics of our model do not satisfy the standard assumptions 
since probabilities are necessarily nonnegative, and thus the test for 
\(\lambda_i = 0\) should take that into account (see Gourieroux, Holly, and 
Monfort 1982). Furthermore, under the null hypothesis that \(\lambda_i = 0\), none 
of the transaction cost parameters associated with that regime can be 
estimated, violating one of the standard regularity conditions needed 
to derive the asymptotic distributions of traditional likelihood ratio or 
Wald test statistics (see Davies 1987). A rigorous test for cases in which 
both types of problems are present is yet to be developed. Thus the 
standard errors of the probabilities presented in table 2 should be 
interpreted with caution. 

12 See Clark (1984) for a discussion of this assumption. 

13 In order to capture directional arbitrage we define the variable ARBIPROB to be 
\(q_1 - q_3\), so that if ARBIPROB = 1 (-1) then the posterior probability of arbitrage 
from the United States to Great Britain (from Great Britain to the United States) is 1. 

14 It is expected that the relationship between gold shipments and posterior arbitrage 
probabilities should change during the panic of 1907 (16 weeks from October 1907 
until January 1908). 

V. Final Comments 

In this paper we make use of an econometric methodology to estimate 
transaction costs and arbitrage opportunities simultaneously during 
the 1899-1908 gold standard period in the United States and Great 
Britain. We estimate a generalized two-limit Tobit model with 
stochastic limits. We obtain transaction cost estimates that are consistent 
with figures presented by Clark and Officer. We provide evidence 
that arbitrage costs were, as suggested by Officer, very volatile. We 
find that arbitrage from the United States to Great Britain occurred 
with a probability of roughly 20 percent. Arbitrage in the opposite 
direction appears to have occurred roughly 10 percent of the time. 
Regressions were performed that seem to indicate that gold 
movements were triggered, at least partially, by arbitrage opportunities. 

Our results provide new evidence of the relative efficiency of the 
gold standard, thereby confirming Officer’s conclusions, even though 
our methodology is quite different from his. Average transaction 
costs were relatively small, and furthermore, arbitrage was not 
triggered very often. Thus the gold standard seems to have been a 
relatively efficient system to provide bounds to exchange rate movements. 


Appendix 

Alternative Transaction Costs Distributions 

A. Gamma 

\(T_{in} \sim \text{gamma}(\alpha_i, \beta_i;\ T_{in} > 0)\), 

k ' * C [' - *( 2 sr)] wfcy <y ”' r ' ‘ HttK 





/. - 


Y 


Y ^t)]" * ex p[~ 




B. Lognormal 

\(T_{in} \sim \text{lognormal}(\alpha_i, \beta_i;\ T_{in} > 0)\), 

which comes from 

\(\log(T_{in}) \sim \text{normal}(\log \alpha_i, \beta_i^2)\). 




/• = 


K/V2ir 


0|E log(KS,/X) 


' Xp (lp5 




The formulas for the ex ante probabilities and densities for the other regimes 
are quite straightforward and are omitted. 

How Big Is the Random Walk in GNP? 


John H. Cochrane 

University of Chicago 


This paper presents a measure of the persistence of fluctuations in 
GNP based on the variance of its long differences. That measure 
finds little long-term persistence in GNP. Previous research on this 
question found a great deal of persistence in GNP, suggesting 
models such as a random walk. A reconciliation of this paper’s results 
with previous research shows that conventional criteria for 
time-series model building can produce misleading estimates of 
persistence. 


I. Introduction 

Macroeconomists once viewed fluctuations in gross national product 
as temporary deviations from a trend. The economic theory of business 
cycles described temporary deviations from “potential GNP,” 
which was assumed to evolve smoothly over time, and data were 
routinely detrended prior to analysis. A body of recent empirical 
work (described below) has questioned this time-honored view. By 
using a variety of time-series models, it finds that fluctuations in GNP 
are permanent—that a decline in GNP today lowers forecasts of GNP 
into the infinite future. 

This paper reexamines the long-run properties of GNP and argues 
that GNP does, in fact, revert toward a “trend” following a shock. 
However, that reversion occurs over a time horizon characteristic of 
business cycles, several years at least. Therefore, the short-run 
properties of GNP are consistent with a model with very persistent shocks, 
and one can incorrectly infer a great deal of long-horizon persistence 
by fitting a time-series model to this short-run behavior. 

I thank Eugene Fama, Lars Hansen, John Huizinga, Robert Lucas, James Stock, 
Robert Shiller, an anonymous referee, and the editors of this Journal for many helpful 
comments and suggestions. 

The class of time-series model most commonly used to describe 
temporary deviations about trend is 

\[
y_t = bt + \sum_{j=0}^{\infty} a_j \epsilon_{t-j}, \tag{1}
\]

where \(y_t\) stands for log GNP, \(bt\) describes the trend, and \(\epsilon_t\) is a random 
disturbance. 1 Fluctuations in \(y_t\) are temporary if \(\sum a_j \epsilon_{t-j}\) is a stationary 
stochastic process (\(y_t\) is then called “trend stationary”). For \(\sum a_j \epsilon_{t-j}\) to 
be stationary, the \(a_j\) must approach zero for large \(j\). As a result, a 
decline in GNP below trend today has no effect on forecasts of the 
level of GNP, \(E_t(y_{t+j})\), in the far future, and it implies that growth 
rates of GNP must rise above their historical average for a few periods 
until the trend line is reestablished. 

The simplest time-series model that captures permanent 
fluctuations in GNP is a random walk with drift: 

\[
y_t = \mu + y_{t-1} + \epsilon_t. \tag{2}
\]

Fluctuations in a random walk are permanent in the following sense: 
suppose that \(\epsilon_t = -1\), so that \(y_t\) falls one unit below last period’s 
expected value. Then, since \(y_{t+j} = y_t + j\mu + \epsilon_{t+1} + \ldots + \epsilon_{t+j}\), 
forecasts \(E_t(y_{t+j})\) fall by one unit for the indefinite future. Also, a low 
or negative growth rate today implies nothing about growth rates in 
the future, and there is no tendency for future levels of GNP to revert 
to a trend line. The random walk is also nonstationary. 
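
The contrast between the two forecasts can be made explicit with a short worked derivation. It is a sketch that uses only models (1) and (2) and assumes the disturbances have zero conditional mean, \(E_t(\epsilon_{t+i}) = 0\) for \(i > 0\).

```latex
% Random walk with drift, model (2): iterate forward j periods.
\[
y_{t+j} = y_t + j\mu + \sum_{i=1}^{j} \epsilon_{t+i}
\quad\Longrightarrow\quad
E_t(y_{t+j}) = y_t + j\mu ,
\qquad
\frac{\partial E_t(y_{t+j})}{\partial \epsilon_t} = 1 \ \text{for every } j .
\]
% Trend-stationary series, model (1): only shocks dated t or earlier
% survive the conditional expectation.
\[
E_t(y_{t+j}) = b(t+j) + \sum_{i=j}^{\infty} a_i \epsilon_{t+j-i}
\quad\Longrightarrow\quad
\frac{\partial E_t(y_{t+j})}{\partial \epsilon_t} = a_j \to 0
\ \text{as } j \to \infty .
\]
```

A unit shock therefore moves every future forecast of the random walk by one unit, while its effect on forecasts of the trend-stationary series dies out with the moving average coefficients.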

The distinction between a random walk (2) and a trend-stationary 
series (1) is extreme. Long-range forecasts of a random walk move 
one for one with shocks at each date, while long-range forecasts of a 
trend-stationary series do not change at all. There are two related 
ways to think about a series that lies between these two extremes. 

First, one can ask how much long-term forecasts respond to shocks. 
In one interpretation, the measure of this paper asks the question, 
How much does a one-unit shock to GNP affect forecasts in the far 
future? If by one unit, it finds a random walk; if by zero, it finds a 
trend-stationary process like (1). It can also find numbers between 
zero and one, characterizing a series that returns toward a “trend” in 
the far future, but does not get all the way there, or it can find a 
number greater than one, characterizing a series that will continue to 
diverge from its previously forecast value following a shock. Campbell 
and Mankiw (1987) originated and emphasize this interpretation. 

1 Simple univariate time-series models like (1) should be thought of as a way of 
capturing the dynamic behavior of \(y_t\) that results from a rich multivariate world. They 
are not "structural" in any way. 

Second, one can model a series whose fluctuations are partly 
temporary and partly permanent as a combination of a stationary series 
and a random walk. The random walk carries the permanent part of a 
change and the stationary series carries the temporary part of a 
change. Then, one can ask how important the permanent or random 
walk component is to the behavior of the series. In a second 
interpretation, the measure of this paper asks the question, How large is the 
variance of shocks to the random walk or permanent component of 
GNP compared with the variance of yearly GNP growth rates? Or, 
equivalently, How big is the random walk in GNP? 

If the variance of the shocks to the random walk component is zero, 
the series is trend-stationary, and long-term forecasts do not change 
in response to shocks. If the variance of the shocks to the random 
walk component is equal to the variance of first differences, the series 
is a pure random walk. As before, there is a continuous range of 
possibilities between zero and one and beyond one. 

A model consisting of a random walk plus a stationary component 
may seem quite special. However, I show below that we can think of 
any series whose growth rates or first differences are stationary (any 
series with a unit root) as a combination of a stationary series plus a 
random walk. The decomposition into stationary and random walk 
components is a convenient way of thinking about the properties of a 
time series, but it adds no structure. I also show that the response to 
innovations is proportional to the square root of the variance of 
shocks to a random walk component, so we can freely transform 
between these two interpretations. 

The idea that GNP may contain a random walk goes back to Irving 
Fisher’s "Monte Carlo hypothesis,” examined by McCulloch (1975). 
There is now a large literature following the first half of Nelson and 
Plosser (1982) that applies the Dickey and Fuller (1979, 1981) and 
subsequent tests for unit roots to aggregate time series. Since a series 
with a unit root is equivalent to a series that is composed of a random 
walk and a stationary component, tests for a unit root are attempts to 
distinguish between series that have no random walk component (or 
for which the variance of shocks to the random walk component is 
zero) and series that have a random walk component (or for which the 
variance of shocks to the random walk component is between zero 
and infinity). Stated this way, it is clear why tests for a unit root have 
low power: it is hard to tell a stationary series from a stationary series 
plus a very small random walk. This paper and the related literature 
cited in it go beyond testing for the presence or absence of a unit root 
or random walk component and measure how important the unit root 
or random walk component is to the behavior of a series. 


Implications of the Random Walk in GNP 

The size of a random walk in GNP is important from a purely 
statistical viewpoint. Many statistical procedures rely critically on the 
distinction between series that do not contain a random walk component (1), 
which we can and should detrend, and first-difference stationary 
series ((3) below, or series that do contain a random walk 
component), which we should first-difference prior to analysis. Hypothesis 
tests that rely on asymptotic distribution theory are an important 
example because that distribution theory is often quite sensitive to the 
presence of a random walk component. A measurement of the size of 
the random walk component can be a better guide to the proper 
procedure than a unit root test because if the random walk 
component is small but still nonzero, then an asymptotic distribution theory 
based on trend stationarity may provide a better approximation in a 
given small sample than the theory based on a unit root. 

The size of a random walk in GNP has been cast as a direct test 
between competing models of the economy. For example, Nelson and 
Plosser (1982) interpreted their result that GNP has a large random 
walk component as evidence for stochastic equilibrium models over 
traditional monetary or Keynesian business cycle models. They 
argued that traditional models produce only temporary deviations from 
trend, while models that find the ultimate source of GNP variability in 
technology shocks can produce permanent fluctuations. 

With the advantage of hindsight, it now seems that the size or 
existence of a random walk component in GNP cannot directly 
distinguish broad classes of economic theories of the business cycle at their 
present stage of development. The Kydland and Prescott (1982) and 
Long and Plosser (1983) stochastic equilibrium models were 
constructed precisely to generate temporary fluctuations about trend. On 
the other hand, King et al. (1987) show that one can modify these 
models to produce a random walk component by introducing a 
random walk in the technology shocks or a linear technology for human 
or physical capital accumulation. Presumably, the same 
modifications would introduce a random walk component into monetary or 
“Keynesian” models as well. 

Furthermore, the results of this paper are compatible with a variety 
of random walk components. I show below that an AR(2) about a 
deterministic trend, which has no random walk component, and a 
model with a random walk whose variance is 0.18 times the variance 
of first differences of log GNP account equally well for the results of 
this paper. Also, the standard errors in this paper are large, and I 
argue that this is unavoidable. I conclude that the existence or size of 
a random walk component in GNP is not a precisely measured 
“stylized fact” that we should require any reasonable model to reproduce. 

The most promising direct use for the point estimates of the size of 
a random walk component in this paper may be the calibration of a 
given model rather than a test that can distinguish competing classes 
of models. If a model (like the ones cited above) produces a random 
walk in GNP, the results of this paper suggest that the parameters of 
that model should be picked to also generate interesting short-run 
dynamics of GNP, so that the variance of yearly changes in GNP is 
much larger than the variance of shocks to its random walk component. 


Other Estimates 

Several authors have estimated the persistence of fluctuations in 
GNP, and their estimates vary greatly. Nelson and Plosser (1982) 
matched a model consisting of permanent and temporary 
components to a stylized autocorrelation function for growth rates of GNP 
and concluded that the permanent component was more important 
than the temporary component. Watson (1986) and Clark (1987) 
estimated different unobserved components models and found a small 
permanent component. Campbell and Mankiw (1987) estimated the 
effect of a shock on long-term forecasts of GNP from the parameters 
of low-order autoregressive, moving average (ARMA) 
representations of postwar GNP and found a large random walk component. 

Several authors have examined the persistence of fluctuations in 
other time series using a variety of methods. Rose (1986) presents a 
survey of papers that find large random walk components in various 
macroeconomic time series. In finance, conventional wisdom favored 
the random walk model while macroeconomists favored the 
trend-stationary model. Poterba and Summers (1987), Fama and French 
(1988), and Lo and MacKinlay (1988) use variance ratio estimators 
similar to the one used in this paper and related estimators to 
document a temporary component in stock prices. Huizinga (1987) uses a 
closely related estimator to document a temporary component in real 
exchange rates. Cochrane and Sbordone (1988) present a 
multivariate extension. 

This Paper’s Technique 

In this paper, I measure the size of a random walk component in GNP 
from the variance of its long differences. The intuition behind this 
measure comes from the following argument: Imagine that log GNP, 
denoted \(y_t\), is a pure random walk (model [2]). Then the variance of its 
\(k\)-differences grows linearly with the difference \(k\): \(\mathrm{var}(y_t - y_{t-k}) = k\sigma_\epsilon^2\). 
On the other hand, if log GNP is stationary about a trend (model 
[1]), the variance of its \(k\)-differences approaches a constant, twice the 
unconditional variance of the series: \(\mathrm{var}(y_t - y_{t-k}) \to 2\sigma_y^2\). Now plot 
\((1/k)\,\mathrm{var}(y_t - y_{t-k})\) as a function of \(k\). If \(y_t\) is a random walk, the plot 
should be constant at \(\sigma_\epsilon^2\). If \(y_t\) is trend-stationary, the plot should 
decline toward zero. 

Next, suppose that fluctuations in GNP are partly permanent and 
partly temporary, which we can model as a combination of a 
stationary series and a random walk. Now the plot of \((1/k)\,\mathrm{var}(y_t - y_{t-k})\) 
versus \(k\) should settle down to the variance of the shock to the random 
walk component. 
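
The behavior of the plot in the two polar cases can be illustrated with a small simulation. This is a sketch under stated assumptions, not the estimator used in the paper: the function names, the AR(1) form of the stationary component, the parameter values, and the sample size are all hypothetical, and the variance is the plain sample variance of overlapping differences with no small-sample correction.

```python
import random

def scaled_k_difference_variance(y, k):
    """Return (1/k) times the sample variance of y[t] - y[t-k]."""
    diffs = [y[t] - y[t - k] for t in range(k, len(y))]
    mean = sum(diffs) / len(diffs)   # removes the drift or trend slope
    return sum((d - mean) ** 2 for d in diffs) / len(diffs) / k

def simulate(n, seed=0, drift=0.02, phi=0.5):
    """Simulate a random walk with drift and an AR(1) about a linear trend."""
    rng = random.Random(seed)
    walk = [0.0]
    trend_stationary = [0.0]
    cycle = 0.0
    for t in range(1, n):
        walk.append(walk[-1] + drift + rng.gauss(0.0, 1.0))
        cycle = phi * cycle + rng.gauss(0.0, 1.0)
        trend_stationary.append(drift * t + cycle)
    return walk, trend_stationary

walk, trend_stationary = simulate(20000)
for k in (1, 10, 30):
    print(k,
          round(scaled_k_difference_variance(walk, k), 3),
          round(scaled_k_difference_variance(trend_stationary, k), 3))

walk_30 = scaled_k_difference_variance(walk, 30)
ts_1 = scaled_k_difference_variance(trend_stationary, 1)
ts_30 = scaled_k_difference_variance(trend_stationary, 30)
```

With unit-variance shocks the random walk column should stay near one at every horizon, while the trend-stationary column should fall toward zero as k grows, up to sampling error.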

If fluctuations in GNP are partly temporary (if the random walk 
component is small and a shock today will be partially reversed in the 
long run), that reversal is likely to be slow, loosely structured, and not 
easily captured in a simple parametric model. The variance of 
\(k\)-differences can find such loosely structured reversion, whereas many 
other approaches cannot. I show in Section IV that this difference can 
reconcile the results of this paper with other measures of the 
permanence of fluctuations in GNP. 


Results 

Figure 1 and table 1 present \((1/k)\,\mathrm{var}(y_t - y_{t-k})\) for log real per capita 
GNP, 1869-1986. Pre-1939 data are taken from Friedman and 
Schwartz (1982). I use real per capita GNP to eliminate possible 
nonstationarity induced by inflation or population growth. (Henceforth, I 
will refer to log real per capita GNP as just “GNP.”) Figure 1 and table 
1 also include asymptotic standard errors, discussed below. Table 1 
also presents \(1/k\) times the variance of \(k\)-differences divided by the 
variance of first differences (the variance ratio). The units in table 1 
and figure 1 are annual percentage growth. 

Since \(1/k\) times the variance of \(k\)-differences settles down to about 
one-third of the variance of first differences, figure 1 and table 1 
suggest that the innovation variance of the random walk component 
is about one-third of the variance of year-to-year changes: annual 
growth rates of GNP contain a large temporary component. In fact, I 
show below that the pattern of figure 1 is consistent with a 
deterministic trend, which has no permanent or random walk component, and 
whose fluctuations are entirely temporary. 

Figure 2 presents the log of real per capita GNP. Notice that this 
data set looks as if it has a trend in it. Fluctuations occur, but the level 
of the series always returns to the “trend line.” Furthermore, that 
trend line is linear: there are no “waves” of low-frequency movement. 
These characteristics drive the finding of a small random walk 
component. (Note that low-frequency movement generated by a 
nonlinear trend, a shift, etc. would show up as a large random walk 
component in this and most other estimation techniques based on 
linear time-series models.) 

Fig. 1. \(1/k\) times the variance of \(k\)-differences of log real per capita GNP, 
1869-1986, with asymptotic standard errors. 

Prewar GNP data are more variable than postwar data, and one 
might suspect that this characteristic drives the result. However, 
figure 3 and table 1 present \(1/k\) times the variance of \(k\)-differences for 
postwar GNP, and the same pattern is evident. Both the variance of 
first differences and the variance of the random walk component are 
lower, but their proportions do not change much. 2 

2 The pattern of fig. 2 is sensitive to the precise specification of the variables. First, 
the variance of quarterly differences of seasonally adjusted GNP is less than one-fourth 
the variance of yearly differences, so the variance ratio is higher if one uses quarterly 
rather than annual differences in the denominator. This observation explains most of 
the difference between fig. 2 and the results reported by Campbell and Mankiw (1988), 
who use a similar technique on quarterly data. Second, taking the variance of 
overlapping \(k\)-year differences of quarterly data vs. the variance of \(k\)-year differences of annual 
averages, including or excluding population growth, taking logs or not, and even 
changing the sample by a few years can all change the variance ratio by about one 
standard error. 


[Table 1: 1/k times the variance of k-differences of log real per capita GNP and the corresponding variance ratios, with asymptotic standard errors. The entries are not recoverable from the source.] 

Romer (1986) argued that prewar GNP data overstate the actual 
cyclical variability of GNP. This possibility will not bias the estimate of 
the variance of the random walk component. Taking \(k\)-differences 
acts as a filter that ignores cyclical fluctuations and concentrates on 
the variability of longer “runs,” so a different GNP data set will have a 
different variance of \(k\)-differences if the early GNP has a significantly 
different and more variable trend line, not if its cyclical fluctuations 
are different. A graph similar to figure 1, using Romer’s adjusted 
early GNP series, produces a variance of a random walk component 
very similar to that of figure 1. It should because Romer kept the 
decade trends the same in her corrections for cyclical volatility. Her 
criticism, or the seasonal adjustment of quarterly data, will affect the 
variance of first differences, so the variance ratio can be biased by 
excessive volatility or smoothness of the first differences. 

The presence of a splice in 1947 also does not drive the result. 
Every long series of GNP data contains at least one splice. The wide 
surveys used to construct later data are simply not available for earlier 
periods, so some projection using a restricted set of industries is 
unavoidable. However, forcing the levels of the “old” and “new” GNP 
series to match at a certain date does not bias the variance of 
\(k\)-differences. It is biased only if the old series has different growth 
rates over long horizons. 

The body of this paper consists of an investigation of \(1/k\) times the 
variance of \(k\)-differences as an estimate of the random walk 
component in GNP. Section II provides several interpretations of a random 
walk component. Section III discusses estimation. Section IV 
reconciles these results with previous research that found a large random 
walk component by showing how conventional time-series estimation 
techniques can provide misleading estimates of a random walk 
component. Section V contains a summary and concluding remarks. 

II. Unit Roots and Random Walk Components 

This section discusses and documents several claims in the Introduc¬ 
tion about the representation of time series. It shows that first- 
difference stationary time series or time series with a unit root are 
equivalent to time series that are composed of a stationary and a 
random walk component. It argues that the variance of shocks to a 
random walk component is just a convenient interpretation of the 
parameters of an arbitrary first-difference stationary series, but it 
requires no additional structure. It shows how to transform between 
the variance of a random walk component and the response of long¬ 
term forecasts to a shock. 

Assume that log GNP follows a first-difference stationary linear
process; that is, growth rates of GNP are stationary. In this case, log
GNP has a moving average representation of the form

$$\Delta y_t = (1 - L) y_t = \mu + A(L)\epsilon_t = \mu + \sum_{j=0}^{\infty} a_j \epsilon_{t-j}, \tag{3}$$

which I take as the starting point; $L$ is the lag operator, $L y_t = y_{t-1}$. The
first equality defines the notation $\Delta y_t$ and $(1 - L) y_t$ for first differences
of $y_t$. The last equality defines the lag polynomial notation $A(L)$. The $\epsilon_t$
are independent identically distributed (i.i.d.) error terms with common
variance $\sigma_\epsilon^2$.

The random walk process (2) obviously has a representation of the
form (3). The trend-stationary process (1) is a limiting case of (3): if
$\mu = b$ and if the lag polynomial $A(L)$ in (3) has a unit root, that is, if
we can express $A(L) = (1 - L)B(L)$, we recover (1) by canceling the
terms $(1 - L)$. Many unobserved components models are first-difference
stationary and hence have a representation (3). Nelson and
Plosser (1982) and Watson (1986) are examples. On the other hand,
(3) does not include nonlinear processes such as Quah (1986), a process
with a nonlinear trend, or second-difference stationary processes
(the growth rates of GNP follow a random walk) as in Clark (1987).

Given the representation (3), we have the following fact. 

Fact 1. Any first-difference stationary process can be represented
as the sum of stationary and random walk components.

To show that a representation as stationary plus random walk
components exists, we simply construct it from the representation (3).
This decomposition comes from Beveridge and Nelson (1981). Let

$$y_t = z_t + c_t, \tag{4}$$

where

$$z_t = \mu + z_{t-1} + \Big(\sum_{j=0}^{\infty} a_j\Big)\epsilon_t,$$

$$-c_t = \Big(\sum_{j=1}^{\infty} a_j\Big)\epsilon_t + \Big(\sum_{j=2}^{\infty} a_j\Big)\epsilon_{t-1} + \Big(\sum_{j=3}^{\infty} a_j\Big)\epsilon_{t-2} + \cdots.$$

This decomposition is constructed so that $\lim_{k\to\infty} E_t y_{t+k} = z_t + k\mu$;
that is, long-term forecasts of $y_t$ converge to $z_t$ plus $k\mu$. In this sense,
$z_t$ is the permanent component of $y_t$. Beveridge and Nelson call it a
stochastic trend. Long-term forecasts of $y_t$ are unaffected by $c_t$, the
temporary component.
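
The construction in (4) can be illustrated numerically. The following is a minimal sketch, assuming a finite-order moving average for the first differences and setting shocks dated before the sample to zero; the function name `beveridge_nelson`, the coefficient values, and the simulated shocks are hypothetical and are not taken from the paper. It builds the permanent and temporary components and checks that they add up to the series.

```python
import random

def beveridge_nelson(a, mu, eps, y0=0.0):
    """Split y into permanent z and temporary c, following (4).

    a   : finite list of moving average coefficients a_0..a_q (hypothetical)
    mu  : drift of the first differences
    eps : list of shocks; shocks dated before the sample are set to zero
    """
    q = len(a) - 1
    a_one = sum(a)  # A(1), the sum of the moving average coefficients
    # tail[m] = sum of a_j for j > m: the weight on eps_{t-m} in -c_t
    tail = [sum(a[m + 1:]) for m in range(q)]
    y, z, c = [], [], []
    y_prev, z_prev = y0, y0
    for t in range(len(eps)):
        dy = mu + sum(a[j] * eps[t - j] for j in range(q + 1) if t - j >= 0)
        y_prev = y_prev + dy
        z_prev = mu + z_prev + a_one * eps[t]  # random walk with drift
        c_t = -sum(tail[m] * eps[t - m] for m in range(q) if t - m >= 0)
        y.append(y_prev)
        z.append(z_prev)
        c.append(c_t)
    return y, z, c

random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(200)]
y, z, c = beveridge_nelson([1.0, 0.5, -0.3], mu=0.02, eps=shocks)
# Largest discrepancy between y_t and z_t + c_t; zero up to rounding error.
max_gap = max(abs(yt - (zt + ct)) for yt, zt, ct in zip(y, z, c))
```

In this sketch the innovation to `z` is `sum(a)` times the shock, so its variance is $(\sum a_j)^2 \sigma_\epsilon^2$, while `c` is a finite moving average of the same shocks and is therefore stationary.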

The innovation variance of the random walk component $\sigma_{\Delta z}^2$ is a
natural measure of the importance of the random walk component.
From the definition (4) we can write the variance of the random walk
component in terms of the moving average representation (3):

$$\sigma_{\Delta z}^2 = \Big(\sum a_j\Big)^2 \sigma_\epsilon^2 = |A(1)|^2 \sigma_\epsilon^2 \tag{5}$$

(sums without indices run from zero to infinity).

In the Beveridge and Nelson decomposition (4), the innovations in
the random walk and stationary components are identical. In a more
general combination of random walk and stationary components, the
innovations may be correlated:

$$\begin{aligned} y_t &= z_t + c_t, \\ z_t &= \mu + z_{t-1} + \eta_t, \\ c_t &= B(L)\delta_t, \quad E(\eta_t \delta_t) \text{ arbitrary}. \end{aligned} \tag{6}$$

If we start with a process (6), $\Delta y_t$ is stationary, and so the process has a
representation of the form (3). Most processes of the form (3) can be
decomposed into a variety of processes (6), with varying correlation
between the innovations; but only the decomposition (4) is guaranteed
to exist. 3

Since a variety of decompositions into stationary and random walk
components of the form (6) exist for any given process of the form (3),
a measure based on the variance of the random walk component
would be in serious trouble if it depended crucially on which arbitrary
decomposition we choose. Fortunately, it does not, as seen in the
following fact.

Fact 2. In every decomposition of a process (3) into stationary and
random walk components (6), the innovation variance of the random
walk component is the same: $\sigma_{\Delta z}^2 = (\sum a_j)^2 \sigma_\epsilon^2$.

To show fact 2, start with an arbitrary decomposition (6). The
corresponding moving average representation of the form (3) is

$$(1 - L) y_t = \mu + \eta_t + (1 - L)B(L)\delta_t = \mu + A(L)\epsilon_t. \tag{7}$$

The last equality defines the parameters $A(L)$ of a moving average
representation from the parameters $B(L)$ of (6). Now form the
Beveridge and Nelson decomposition of both sides of the last equality
in (7). Since the processes on both sides of the last equality are the
same, they must have the same variance of a random walk component,
so we must have $|A(1)|^2 \sigma_\epsilon^2 = \sigma_\eta^2$. 4 The correlation between $\eta_t$ and

3 Watson (1986) derives this fact. For example, if we seek a representation with
uncorrelated innovations, the spectral density of the combination can be no less than
the spectral density of each component; thus such a representation exists only if the
spectral density of the first differences has a global minimum at zero.

4 This statement can be more compactly derived by noting that for the processes on
each side of the last equality in (7) to be the same, their spectral densities must be the
same at all frequencies, and zero in particular.
$\delta_t$ is irrelevant for this argument, so every
decomposition (6) of the same moving average representation (3)
must have the same variance of shocks to the random walk
component. This argument demonstrates fact 2.

There is one more interpretation, which will be useful in the next
section. The spectral density of $\Delta y_t$ is, by (3), $S_{\Delta y}(e^{-i\omega}) = |A(e^{-i\omega})|^2 \sigma_\epsilon^2$. 5
Therefore, we have the following fact.

Fact 3. The innovation variance of the random walk component is
equal to the spectral density of $\Delta y_t$ at frequency zero, that is,

$$\sigma_{\Delta z}^2 = \Big(\sum a_j\Big)^2 \sigma_\epsilon^2 = S_{\Delta y}(e^{-i0}) \tag{8}$$

or, dividing by the variance of first differences,

$$\frac{\sigma_{\Delta z}^2}{\sigma_{\Delta y}^2} = \frac{(\sum a_j)^2}{\sum a_j^2} = \frac{S_{\Delta y}(e^{-i0})}{\sigma_{\Delta y}^2}. \tag{8'}$$

Equations (8) and (8') summarize three equivalent ways of looking
at the long-run properties of a series: we can break it into permanent
(random walk) and temporary (stationary) components, we can examine
the response of long-term forecasts to an innovation, or we can
examine the spectral density at frequency zero of its first differences.
All three interpretations allow us to think of the permanence of the
fluctuations in a series as a continuous phenomenon rather than a
discrete choice. Furthermore, equations (8) and (8') show that the
quantity $\sigma_{\Delta z}^2$ or $\sigma_{\Delta z}^2/\sigma_{\Delta y}^2$ defined from the Beveridge and Nelson
decomposition (4) is no more than a useful interpretation of the sum of
the moving average coefficients $\sum a_j$. The decomposition into
stationary and random walk components adds no structure.

The variance of shocks to the random walk component or spectral
density at frequency zero of first differences also captures all the
effects of a unit root on the behavior of a series in a finite sample. As a
sample of T observations of a series is completely characterized by its
T - 1 autocovariances, it is also completely characterized by T - 1
periodogram ordinates. By changing the periodogram ordinate at
frequency zero of first differences without changing the others, we
can make a stationary series into a series with a unit root or random
walk component and vice versa. 6

Since the size of a random walk component is a continuous choice,
any test for trend stationarity ($\sigma_{\Delta z}^2 = 0$ or $S_{\Delta y}(e^{-i0}) = 0$) must have
arbitrarily low power against the alternative of a small enough random
walk component $\sigma_{\Delta z}^2$. As a result, efforts to categorize series as
trend-stationary or difference-stationary and read great things into
the difference between the two will not be very fruitful.

5 I use the notation $S(e^{-i\omega})$ for the spectral density at frequency $\omega$ and, hence, $S(e^{-i0})$
for the spectral density at $\omega = 0$.

6 With an infinite sample, or in population, this proposition does not hold. The
spectral density is defined only almost everywhere; and in some cases we can bound the
variation of the population spectral density function with very weak assumptions.
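
The equivalence of the long-run measures in this section can be checked numerically. The sketch below is only an illustration under stated assumptions: the moving average coefficients are hypothetical, the process is a finite-order moving average, and the function names are introduced here. It computes the variance ratio once from the sum of the moving average coefficients and once from the sum of the autocorrelations of the first differences (the spectral density at frequency zero divided by the variance), and the two agree.

```python
def autocorrelations(a):
    """Autocorrelations rho_1..rho_q of a finite MA with coefficients a."""
    gamma0 = sum(x * x for x in a)
    return [sum(a[l] * a[l + j] for l in range(len(a) - j)) / gamma0
            for j in range(1, len(a))]

def ratio_from_ma(a):
    """(sum a_j)^2 / sum a_j^2: the variance ratio from the MA coefficients."""
    return sum(a) ** 2 / sum(x * x for x in a)

def ratio_from_autocorr(a):
    """1 + 2 * sum_j rho_j: spectral density at zero over the variance."""
    return 1.0 + 2.0 * sum(autocorrelations(a))

a = [1.0, 0.5, -0.3, -0.4]  # hypothetical moving average coefficients
r_ma = ratio_from_ma(a)
r_ac = ratio_from_autocorr(a)
```

Two limiting cases may help intuition: `ratio_from_ma([1.0])` is one (a pure random walk), and `ratio_from_ma([1.0, -1.0])` is zero (first differences of a stationary series, so no random walk component).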

III. Estimation 

I claimed in the Introduction that the variance of k-differences could
be used to estimate the innovation variance of a random walk
component. To document that claim and to provide standard errors,
this section discusses the statistical properties of the variance of
k-differences.

Asymptotic Properties 

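Let $\sigma_k^2$ denote 1/k times the population variance of k-differences of
$y_t$, $\sigma_k^2 = k^{-1}\,\mathrm{var}(y_t - y_{t-k})$; $\sigma_k^2$ is related to the autocorrelation
coefficients of $\Delta y_t$ by

$$\sigma_k^2 = \Big(1 + 2\sum_{j=1}^{k-1} \frac{k-j}{k}\,\rho_j\Big)\sigma_{\Delta y}^2, \tag{9}$$

where $\sigma_{\Delta y}^2 = \mathrm{var}(y_t - y_{t-1})$ and $\rho_j = \mathrm{cov}(\Delta y_t, \Delta y_{t-j})/\sigma_{\Delta y}^2$. The derivation
is straightforward but tedious, so it is presented in the Appendix.
Equation (9) shows that the limit of $\sigma_k^2$ is indeed the innovation
variance of the random walk component:

$$\lim_{k\to\infty} \sigma_k^2 = \Big(1 + 2\sum_{j=1}^{\infty} \rho_j\Big)\sigma_{\Delta y}^2 = S_{\Delta y}(e^{-i0}) = \sigma_{\Delta z}^2. \tag{10}$$

The second equality is the definition of spectral density, while the
third is reproduced from equation (8).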
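
As a hedged illustration of this limit, the sketch below evaluates the population value of 1/k times the variance of k-differences for a first-order moving average, using the weighted sum of autocovariances with weights (k - j)/k. The parameter value $\theta = -0.5$ and the function names are assumptions made for the example only.

```python
def autocovariance(a, sigma_eps_sq, j):
    """gamma_j of the first differences for finite MA coefficients a."""
    return sigma_eps_sq * sum(a[l] * a[l + j] for l in range(len(a) - j))

def sigma_k_sq(a, sigma_eps_sq, k):
    """Population 1/k times the variance of k-differences.

    Uses gamma_0 + 2 * sum_{j=1}^{k-1} ((k - j) / k) * gamma_j.
    """
    gamma0 = autocovariance(a, sigma_eps_sq, 0)
    weighted = sum((k - j) / k * autocovariance(a, sigma_eps_sq, j)
                   for j in range(1, k))
    return gamma0 + 2.0 * weighted

theta = -0.5                      # hypothetical MA(1) parameter
ma1 = [1.0, theta]
path = [(k, sigma_k_sq(ma1, 1.0, k)) for k in (1, 2, 5, 10, 30, 100)]
limit = (1.0 + theta) ** 2        # (sum a_j)^2 * sigma_eps^2
```

For this example the values in `path` start at 1.25, the variance of first differences, and decline toward `limit` = 0.25 as k grows, which is the pattern of a series with a small random walk component.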

Equation (9) suggests that we could also estimate 1/k times the
variance of k-differences by using sample autocorrelations $\hat\rho_j$ in the
place of their population values $\rho_j$. (Huizinga [1987] and Campbell
and Mankiw [1988] perform the calculation this way.) The right-hand
side of (9) with $\hat\rho_j$ in place of $\rho_j$ is the definition of the Bartlett
estimator of the spectral density at frequency zero (Anderson 1971, p.
511). Hence, 1/k times the variance of k-differences is asymptotically
equivalent to the Bartlett estimator. 7

7 1/k times the variance of k-differences and the conventional Bartlett estimate are
not identical in small samples. The estimates of sample autocorrelations implied by the
sample variance of k-differences underweight observations k dates away from the
endpoints, compared with the usual estimates of autocorrelation. The difference
disappears asymptotically but may be important in small samples. Also, the conventional
Bartlett estimate is not unbiased in small samples, as the corrected 1/k times the
variance of k-differences $\hat\sigma_k^2$ is for a random walk. I thank John Huizinga for pointing this
out.

The properties of the Bartlett estimator are well known, so we can
establish the asymptotic properties of 1/k times the variance of
k-differences by reference to those of the Bartlett estimator. In
particular, (1) if $k/T \to 0$ as $T \to \infty$, where $T$ is the sample size, 1/k times the
sample variance of k-differences is a consistent estimate of the
spectral density at frequency zero; (2) the asymptotic variance of $\hat\sigma_k^2$ is
$4kS^2(e^{-i0})/3T$ (Anderson 1971, p. 531).

The equivalence between 1/k times the variance of k-differences
and the Bartlett estimator provides a useful interpretation of the
variance of k-differences for readers familiar with spectral density
estimation; in turn, the variance of k-differences is a useful and
intuitive time domain counterpart to the Bartlett spectral density
estimator. To use the Bartlett estimator, we have to decide what k to
use: how many autocovariances or autocorrelations to include in (9)
or how many periodogram ordinates to smooth. The choice of k
requires a trade-off between bias and efficiency, and it is usually made
arbitrarily. In this context, a plot of 1/k times the variance of
k-differences versus k is an experimental determination of the proper k
or window width.


Small-Sample Properties 

In small samples, 1/k times the variance of k-differences and the
Bartlett estimator can be biased, and the asymptotic standard errors may
be a poor approximation to the actual standard errors. In this
subsection, I discuss corrections for small-sample bias, and I present some
Monte Carlo experiments to evaluate standard errors.

I corrected for two sources of small-sample bias in the sample
variance of k-differences. These corrections produce an estimator of $\sigma_k^2$
that is unbiased when applied to a pure random walk with drift. First,
I used the sample mean of the first differences to estimate the drift
term $\mu$ at all k rather than estimate a different drift term at each k
from the mean of the k-differences. Second, I included a degrees of
freedom correction $T/(T - k + 1)$. Without this correction, 1/k times
the variance of k-differences declines toward zero as $k \to T$ for any
process because you cannot take a variance with one data point.

I will use the notation $\hat\sigma_k^2$ to denote 1/k times the bias-corrected
sample variance of k-differences. The formula for $\hat\sigma_k^2$ is presented in
the Appendix as equation (A3). The Appendix also contains a proof
that $\hat\sigma_k^2$ is unbiased when $y_t$ is a random walk with drift.
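
A minimal sketch of the bias-corrected estimator follows, assuming it takes the form $\hat\sigma_k^2 = \frac{T}{k(T-k)(T-k+1)}\sum_{j=k}^{T}[y_j - y_{j-k} - (k/T)(y_T - y_0)]^2$, which is consistent with the two corrections described above (a single drift estimate from the mean first difference, and a degrees of freedom adjustment). The function names, seed, drift value, and number of trials are hypothetical choices for illustration, not those used for the paper's tables.

```python
import random

def sigma_hat_k_sq(y, k):
    """Bias-corrected 1/k times the sample variance of k-differences.

    y holds T + 1 levels y_0..y_T. The drift is estimated once, from the
    mean first difference (y_T - y_0) / T, and the factor
    T / ((T - k) * (T - k + 1)) is the degrees of freedom correction.
    """
    T = len(y) - 1
    if not 1 <= k < T:
        raise ValueError("need 1 <= k < T")
    drift = (y[T] - y[0]) / T
    total = sum((y[j] - y[j - k] - k * drift) ** 2 for j in range(k, T + 1))
    return T * total / (k * (T - k) * (T - k + 1))

def simulate_random_walk(T, mu, rng):
    """T + 1 levels of a random walk with drift mu and unit innovations."""
    y = [0.0]
    for _ in range(T):
        y.append(y[-1] + mu + rng.gauss(0.0, 1.0))
    return y

rng = random.Random(12345)
trials = 2000
mean_est = {k: 0.0 for k in (1, 10, 30)}
for _ in range(trials):
    path = simulate_random_walk(100, 1.0, rng)
    for k in mean_est:
        mean_est[k] += sigma_hat_k_sq(path, k) / trials
```

Under this assumed form, each squared term has expectation $k\sigma_\epsilon^2 (T-k)/T$ for a random walk with drift, and there are $T - k + 1$ terms, so the leading factor makes the estimator unbiased; the averages in `mean_est` should therefore be close to one at every k.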

Table 2 presents standard errors from a Monte Carlo experiment
using 100 observations of a random walk with drift. I picked the
innovation variance of this random walk $\sigma_\epsilon^2 = \sigma_{\Delta z}^2 = 1$. The mean of $\hat\sigma_k^2$
was very close to one at all k in this experiment, confirming the bias
corrections for a pure random walk. The table presents the standard
errors from the Monte Carlo experiment and the corresponding
Bartlett standard errors for comparison. The Bartlett errors slightly
understate the Monte Carlo errors at large k/T, but the difference is
small compared to the size of the standard errors. Monte Carlo
experiments with different sample sizes and random walk variance confirm
that the standard errors of table 2 scale with k/T and the innovation
variance of the random walk.

TABLE 2

Monte Carlo Standard Errors for 1/k Times the Variance of k-Differences

Model: $y_t = 1 + y_{t-1} + \epsilon_t$; $\sigma_\epsilon^2 = 1$ (T = 100, 500 trials)

100 k/T          1     2     3     4     5    10    20    30    40    50
Monte Carlo   .137  .160  .200  .231  .263  .409  .607  .772  .888  .896
Bartlett*     .115  .163  .200  .231  .258  .365  .516  .632  .730  .816

* This row gives $(4k/3T)^{1/2}$.
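
The Bartlett row of table 2 can be reproduced directly from the asymptotic variance $4kS^2(e^{-i0})/3T$ with a unit spectral density at frequency zero. The short sketch below does this; the function name `bartlett_se` and the list of k/T values (read off the table header) are introduced here for illustration.

```python
import math

def bartlett_se(k_over_t, sigma_sq=1.0):
    """Asymptotic standard error (4k/3T)^(1/2) times sigma^2."""
    return math.sqrt(4.0 * k_over_t / 3.0) * sigma_sq

ratios = [0.01, 0.02, 0.03, 0.04, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50]
row = [round(bartlett_se(r), 3) for r in ratios]
# row == [0.115, 0.163, 0.2, 0.231, 0.258, 0.365, 0.516, 0.632, 0.73, 0.816]
```

Because the standard error is proportional to `sigma_sq`, the same function gives the scaled standard errors for a random walk component of a different size.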

What about processes that are more complicated than a pure random
walk? The Appendix presents a derivation of $E(\hat\sigma_k^2)$ for a
first-order moving average: $(1 - L) y_t = \mu + (1 + \theta L)\epsilon_t$. It shows that $E(\hat\sigma_k^2)$
approaches $\sigma_{\Delta z}^2$ for large k, so $\hat\sigma_k^2$ can recover the variance of the
random walk component for this process as well.

I ran several further Monte Carlo simulations to examine whether
the variance of k-differences is robust when applied to more
complicated processes for GNP. I fit a variety of ARMA processes to first
differences of log real per capita GNP, simulated 118 observations of
each process, and computed $\hat\sigma_k^2$ in 100 trials. In each case, the mean of
$\hat\sigma_k^2$ at k = 30 was equal to the variance of the random walk component
implied by the estimated ARMA processes (k = 30 was large enough
to identify the random walk from the stationary components), and
the standard errors at large k were close to those implied by table 2,
scaled to the variance of the random walk component.

All the low-order ARMA processes produced $\hat\sigma_k^2$ lines that rise for k
from 1 to 5 and then are flat at the variance of the random walk
component from k = 10 on, unlike figure 1. They implied $\sigma_{\Delta z}^2 > \sigma_{\Delta y}^2$.
Two processes that do capture the behavior of figure 1 are an AR(15),
figure 4, and AR(2) about a deterministic trend, figure 5. In the next
section, I will discuss why the low-order ARMA models failed to
capture the behavior of figure 1. For now, note that since they
replicate the behavior of $\hat\sigma_k^2$ for GNP, figures 4 and 5 can provide
small-sample standard errors. These standard errors are similar to the
asymptotic standard errors used in figure 1.

Figures 4 and 5 also include $\hat\sigma_k^2$ for GNP from figure 1, marked
$\hat\sigma_k^2$(GNP). Since the $\hat\sigma_k^2$(GNP) line falls inside the one-standard-error
bands, neither model can be rejected for real GNP. However, the
standard errors from the random walk (table 2) or any of the other
low-order ARMA processes are large enough that we cannot reject
them at 5 percent either. (Note that the standard errors scale with the
size of the random walk component. Under the hypothesis of a random
walk, the standard errors are bigger than indicated in fig. 1.) A
confidence interval includes both $\sigma_{\Delta z}^2/\sigma_{\Delta y}^2 = 0$ and 1.

While this is unfortunate, I will argue below that estimates of a 
random walk component are limited by the number of nonoverlap¬ 
ping "long runs” in the data set, so that large efficiency gains are not 
possible without imposing additional structure on the time-series pro¬ 
cess for GNP. As a result, this and related exercises can provide a 
point estimate of the size of a random walk component with associ¬ 
ated standard errors but will not provide useful tests to discriminate 
between models that imply various sizes of the random walk com¬ 
ponent. 

The parameters of the AR(15) model imply that the variance ratio
$\sigma_{\Delta z}^2/\sigma_{\Delta y}^2 = .18$, while the AR(2) about a trend implies $\sigma_{\Delta z}^2/\sigma_{\Delta y}^2 = 0$.
Hence, the simulations behind figures 4 and 5 also reveal an upward
bias in $\hat\sigma_k^2$ as an estimate of the random walk component when the
series has a small random walk component or is trend-stationary.

In summary, 1/k times the variance of k-differences $\hat\sigma_k^2$ provides an
upward-biased point estimate of the variance ratio $\sigma_{\Delta z}^2/\sigma_{\Delta y}^2$ of about
.34, and two models with $\sigma_{\Delta z}^2/\sigma_{\Delta y}^2 = .18$ and 0 replicate the behavior of
the variance of k-differences of GNP. However, standard errors are
large enough that we cannot statistically reject variance ratios between
zero and one at conventional levels of significance.


IV. Reconciliation with Previous Estimates 

Given the definition of the random walk component in terms of the
parameters of a moving average representation, (4) or (8) above, the
obvious thing to do is either to estimate a parsimonious time-series
model for $\Delta y_t$ and calculate $\sum a_j$ or to identify and estimate a simple
parametric unobserved components model like (4). Campbell and
Mankiw (1987) and Nelson and Plosser (1982) did just that,
respectively, and both found large random walk components. Why do
Nelson and Plosser and Campbell and Mankiw find large random walk
components, while Watson (1986), Clark (1987), and I find small
ones? Though there are small differences in definition (which
quantities we look at to measure the importance of unit roots or random
walk components), the major difference is in estimation strategies.

Nelson and Plosser specified an unobserved components model of
the form

$$\begin{aligned} y_t &= u_t + v_t, \\ (1 - L) u_t &= \mu + A(L)\epsilon_t, \quad \epsilon_t \text{ i.i.d.}, \\ v_t &= B(L)\delta_t, \quad \delta_t \text{ i.i.d.} \end{aligned} \tag{11}$$

($\epsilon_t$ and $\delta_t$ may be correlated). They identified the two components
from a stylized autocorrelation function of GNP growth rates. If the 
first autocorrelation of Ay, is positive but the others are zero, then the 
only model of the form (11) that works is A (L) = 1 and B(L) = (1 + 
%L). By examining plausible parameter values for this restricted 
model, Nelson and Plosser concluded that $\sigma_\epsilon^2 > \sigma_\delta^2$. 8

Campbell and Mankiw (1987) estimated parsimonious ARMA
representations of log GNP, using seasonally adjusted quarterly postwar
data. They measured the importance of the random walk component
by $\sum a_j = A(1)$, the change in $z_t$ (the long-term forecast) in response to
a unit univariate innovation in GNP. They found values for $A(1)$
equal to or larger than one, which imply an innovation variance of the
random walk component greater than the variance of first differences
of GNP. 9


8 This measure of the importance of a random walk component has the conceptual
disadvantage that it depends on which arbitrary unobserved components
decomposition we choose. For example, since every series of the form (11) has a unique moving
average representation, we could rewrite (11) as $(1 - L) y_t = \mu + C(L)\nu_t$, $\nu_t$ i.i.d., and
eliminate the stationary component. Alternatively, we could use the Beveridge and
Nelson decomposition of Sec. II to make the component with a unit root into a pure
random walk:

$$\begin{aligned} y_t &= z_t + c_t, \\ (1 - L) z_t &= \mu + \nu_t, \quad \nu_t \text{ i.i.d.}, \\ c_t &= C(L)\xi_t, \quad \xi_t \text{ i.i.d.} \end{aligned}$$

These representations are observationally equivalent to the first form (11), but the
measure $\sigma_\epsilon^2/\sigma_\delta^2$ changes according to which one we choose. In contrast, the innovation
variance of a random walk component is invariant to the choice of decomposition (fact
2 in Sec. II). Also, the ratio of the innovation variance of the two components is not a
good measure of their relative importance because the proportion of the variance of $\Delta y_t$
explained by $u_t$ and $v_t$ depends on the coefficients of $A(L)$ and $B(L)$ as well as the ratio
$\sigma_\epsilon^2/\sigma_\delta^2$.

9 There are some conceptual disadvantages to scaling a persistence measure by the
univariate innovations of $y_t$. The univariate innovations are not observable and must be
inferred from a model; the univariate innovations do not correspond to the "surprise"
movement because we live in a multivariate environment; a series may have small
innovations but a large variance. For example, $\Delta y_t = 1.5\Delta y_{t-1} - .95\Delta y_{t-2} + \epsilon_t$. For this
process, $\sum a_j = 2.22$ but $\sigma_{\Delta z}^2/\sigma_{\Delta y}^2 = (\sum a_j)^2/(\sum a_j^2) = 0.20$. However, for the GNP data used
in this paper, there is little qualitative difference between the two definitions, and the
difference in results must be explained by differences in estimation strategy.
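
The two numbers in the AR(2) example of note 9 can be verified by expanding the autoregression into its moving average coefficients. The sketch below is one way to do so; the function name and the truncation at 2,000 coefficients are choices made here (the coefficients decay geometrically, so the truncation error is negligible).

```python
def ar2_ma_coefficients(phi1, phi2, n):
    """First n moving average coefficients of (1 - phi1 L - phi2 L^2)^(-1)."""
    a = [1.0, phi1]
    while len(a) < n:
        a.append(phi1 * a[-1] + phi2 * a[-2])
    return a[:n]

a = ar2_ma_coefficients(1.5, -0.95, 2000)
sum_a = sum(a)                                      # A(1) = 1 / (1 - 1.5 + .95)
variance_ratio = sum_a ** 2 / sum(x * x for x in a)  # (sum a_j)^2 / sum a_j^2
```

Here `sum_a` is about 2.22 while `variance_ratio` is about 0.20: the response of the long-term forecast to a unit innovation is large, but the innovations are small relative to the variance of the series.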

In performing the Monte Carlo simulations of Section III, I also
found that low-order ARMA models of GNP imply that $\sigma_k^2$ should rise
with k, and they imply a large random walk component, while in fact
$\sigma_k^2$ declines and the estimated random walk component is small. To
replicate the behavior of $\sigma_k^2$ for GNP, I had to estimate an AR(15) or
impose a deterministic trend.

To investigate this fact further, I fit a variety of ARMA processes to
GNP growth rates, ranging from white noise out to an AR(15) (see
table 3). 10 All representations past white noise are adequate by usual
standards: the Durbin-Watson statistics are close to 2, the significance
levels of the Q-statistic are around .5, the parameters of overfit models
are statistically insignificant, and so forth. But the variance ratio and
$\sum a_j$ start at about 1.2 for second-order processes and decline steadily
to a variance ratio of .18 and $\sum a_j = .5$ for an AR(15). Low-order
ARMA models systematically overestimate the random walk
component of GNP, even though they adequately represent the series by all
the usual diagnostic tests. The question is, why?

The innovation variance of a random walk component is a property
of the very long-run behavior of a series alone. It is the spectral
density at the frequency $\omega = 0$ corresponding to a period or "run" of
infinity, it is related to the infinite sum of the moving average
coefficients $|\sum a_j|^2$ or the autocorrelation coefficients $(1 + 2\sum \rho_j)$, and it
corresponds to the effect of a shock today on forecasts into the infinite
future. In theory, then, we should have to wait an infinite amount of
time to get just one observation on the size of the random walk
component!

In practice, we typically believe that the dynamic response of GNP
to a shock is flat after a suitable long run has arrived. 11 This belief is
implicit above: the graphs stop after the thirtieth difference, reflecting
a belief that after 30 years the temporary effects of business cycles
are over. The number of nonoverlapping long runs is a rough guide
to the number of degrees of freedom (precisely, the number of
periodogram ordinates) in this exercise. With a 10-20-year long run
there are no more than five to 10 independent observations in 100
years of data and two to four observations in postwar data. Obviously,
using more frequently sampled data does not help.

Estimating an unobserved components model or a parsimonious
ARMA model is an attempt to circumvent this problem. These models
make identifying restrictions across frequencies: they draw
inferences about the long-run (high-order autocorrelation or
low-frequency) dynamics from a model fit to the short-run (low-order
autocorrelation or high-frequency) dynamics. For an example that
demonstrates how "effective" these procedures are, Campbell and
Mankiw (1987) report estimates such as $\hat A(1) = 1.306 \pm .073$ for the
20-year forecast of GNP. Since there are only two nonoverlapping
20-year forecasts in their data set, it is clear how heavily their
estimates of $A(1)$ depend on the identifying assumption that the series
follow a given low-order ARMA model.

10 I used the RATS program to perform the estimation. Autoregressive models are
estimated by ordinary least squares and moving average models by conditional
maximum likelihood. The unreported moving average models did not converge.

11 Precisely, if the coefficients of the moving average representation (1) are zero past
a long-run value $M < \infty$, then the derivative of the spectral density of $\Delta y$ at zero is
bounded. If $y_t$ is in fact trend-stationary and the spectral density of $\Delta y$ at frequency zero
is in fact zero, then the slope of the spectral density of $\Delta y$ at zero is also zero.

If the short- and long-run dynamics of GNP can both be captured 
by the assumed time-series model, these procedures can help estima¬ 
tion because we have much more data on high-frequency fluctuations. 
However, if the long-run dynamics cannot be captured in the model 
used to study the short run, these identification procedures bias con¬ 
clusions about long-run behavior. 

I offer two ways to see this fact. First, recall that the variance of the
shock to the random walk component is related to the sum of the
autocorrelations by

$$\frac{\sigma_{\Delta z}^2}{\sigma_{\Delta y}^2} = 1 + 2\sum_{j=1}^{\infty} \rho_j. \tag{12}$$

When we model short-run dynamics, we safely ignore high-order
statistically insignificant autocorrelations or we slightly misspecify
them by fitting a simple model. But all autocorrelations enter into (12)
equally, so a large number of small high-order autocorrelations can
offset a few large low-order autocorrelations.
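
A small numerical illustration of this offsetting, using made-up autocorrelations rather than estimates from GNP data: two sizable positive autocorrelations at short lags are followed by twenty small negative ones. The values and the function name are hypothetical.

```python
def variance_ratio(rhos):
    """One plus twice the sum of a finite list of autocorrelations."""
    return 1.0 + 2.0 * sum(rhos)

low_order = [0.3, 0.1]       # hypothetical large autocorrelations, lags 1-2
high_order = [-0.03] * 20    # hypothetical small autocorrelations, lags 3-22

short_run_only = variance_ratio(low_order)           # about 1.8
all_lags = variance_ratio(low_order + high_order)    # about 0.6
```

A model that sets the high-order terms to zero would report a ratio of about 1.8, while the full set of autocorrelations implies about 0.6, even though each individual high-order term is small.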

Second, GNP growth has a positive autocorrelation at short lags
and a small random walk component at long lags. A simple
time-series model may not be able to capture both kinds of behavior. For
example, if $(1 - L) y_t = \mu + (1 + \theta L)\epsilon_t$, we need $\theta > 0$ to capture
positive first-order autocorrelation but $\theta < 0$ to capture a small random
walk component. Faced with a choice, maximum likelihood
estimates match the short-run behavior (they fit $\theta > 0$ in the example)
and misrepresent the long-run behavior.
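
The tension in the first-order moving average example can be made concrete. For $(1 - L) y_t = \mu + (1 + \theta L)\epsilon_t$, the first autocorrelation of growth rates is $\theta/(1 + \theta^2)$ and the variance ratio is $(1 + \theta)^2/(1 + \theta^2)$. The sketch below evaluates both at a few illustrative values of $\theta$; the function names and the chosen values are assumptions for the example.

```python
def ma1_first_autocorrelation(theta):
    """rho_1 of the first differences when they follow (1 + theta L) eps_t."""
    return theta / (1.0 + theta ** 2)

def ma1_variance_ratio(theta):
    """(sum a_j)^2 / sum a_j^2 = (1 + theta)^2 / (1 + theta^2)."""
    return (1.0 + theta) ** 2 / (1.0 + theta ** 2)

for theta in (-0.5, 0.0, 0.5):  # hypothetical parameter values
    print(theta, ma1_first_autocorrelation(theta), ma1_variance_ratio(theta))
```

Any $\theta > 0$ gives a variance ratio above one, and any ratio below one requires $\theta < 0$ and hence a negative first autocorrelation, so this model cannot deliver both features at once.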

The Appendix contains a demonstration of this property of maximum
likelihood estimates. It shows that maximum likelihood
estimates of a model such as a low-order ARMA or a simple parametric
unobserved components model pick parameters that match the
model's and the actual spectral density over the entire frequency range.
Therefore, maximum likelihood will sacrifice accuracy in the small
region around $\omega = 0$ to better match spectral densities at higher
frequencies.

In summary, the low-order ARMA approach of Campbell and
Mankiw and the unobserved components approach of Nelson and
Plosser cannot match the short-run dynamics and the small random
walk component in the long-run dynamics at the same time. Faced
with the choice, they capture the short-run dynamics and incorrectly
imply large random walk components.

On the other hand, Clark's (1987) and Watson's (1986) decompositions
can accommodate the behavior of GNP in both frequency
ranges. (See, e.g., Watson's fig. 1b, in which he shows how his model
can represent a large number of small high-order autocorrelations
that a low-order ARMA cannot match.) Both Watson and Clark find a
small random walk component. However, their decompositions also
imply identifying restrictions to estimate long-run behavior from
short-run dynamics. Since these restrictions are no more or less
plausible than Nelson and Plosser's or Campbell and Mankiw's, they might
not be able to capture the pattern of high-order correlations in other
data sets as they seem to do for GNP.

Since the size of the random walk component is a property of the
periodogram ordinate at frequency zero alone, any estimation
technique must make some identifying restriction across the frequency
range. The variance of k-differences assumes that past a certain k the
random walk component is adequately identified, empirically
determined as the point at which the graph (fig. 1) flattens out. Therefore,
the variance of k-differences (or any other spectral window estimator)
uses 10-20-year period information to identify the infinite-run
property, the random walk component. The variance of k-differences does
not use information about dynamics at business cycle frequencies to
identify long-run movements, and this is its important advantage.

V. Conclusion

The variance of k-differences (fig. 1 and table 1) produced a point
estimate that the innovation variance of the random walk component
of GNP is about one-third the variance of yearly GNP growth rates.
That estimate is upward biased for small random walk components:
the parameters of two models that replicated the behavior of the
variance of k-differences of GNP implied variance ratios of .18
(AR(15)) and 0 (AR(2) about a deterministic trend). I conclude that if
there is a random walk component in GNP at all, it is small.

Another way to characterize these results, without reference to ran¬ 
dom walk components, is that GNP growth is positively autocor- 
related at short lags, but there are many small negative autocorrela¬ 
tions at long lags. These bring future GNP back toward, if not all the 
way back to, its previously forecast value following a shock. 

These results do not mean that "GNP follows an AR(2) about a
deterministic trend." Our forecasts of the future may quite rightly be
much more variable than the "trend" in GNP we have seen in the
recent 118-year past might suggest. 12 These results do mean that an
AR(2) about a deterministic trend or a difference-stationary ARMA
process with a very small random walk component is a good in-sample
characterization of the behavior of GNP.

In reconciling these results with previous research, I argued that
conventional criteria for time-series model identification and
estimation can produce misleading estimates of the random walk
component of a series like GNP. The random walk component is a property
of all autocorrelations taken together, but conventional procedures
concentrate on the first few autocorrelations in order to
parsimoniously capture short-run dynamics. When used to estimate the size of a
random walk component, they impose identifying restrictions across
the frequency range to infer the long-run properties of a series from
its short-run dynamics. I argued that, in the absence of credible
identifying restrictions, it is best to leave the short run out altogether, as
the variance of k-differences or some other spectral window estimator
does.

However, this view—that we should use only long-run properties 
of GNP data to estimate the long-run behavior of GNP—implies that 
standard errors of univariate estimates of the random walk compo¬ 
nent will remain large in century-long macroeconomic data and 
larger still in postwar macroeconomic data because there are inher¬ 
ently few nonoverlapping long runs available. These observations ar¬ 
gue against the research strategy that says that the presence of a unit 
root and the size of a random walk component are crucial and well- 
documented stylized facts that any theoretical model must replicate. 


Appendix 

A. Derivation of Equation (9) 

Start with 

$$(1 - L)y_t = \mu + A(L)\epsilon_t = \mu + \sum_{j=0}^{\infty} a_j\epsilon_{t-j}. \tag{A1}$$


12 A plausible model for GNP should have some random walk component. If GNP is 
truly stationary about a linear trend, then the variance of the forecast error of the level 
of GNP is the same for all dates in the far future. As long as there is some random walk 
component, the variance of forecast errors will grow unboundedly over the forecast 
horizon. However, only a very small random walk component is required to achieve this 
desirable property. 





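Using 

$$(1 - L^k)(1 - L)^{-1} = (1 + L + L^2 + \cdots + L^{k-1}),$$

$$y_t - y_{t-k} = k\mu + \sum_{j=0}^{k-1}\Bigl(\sum_{l=0}^{j} a_l\Bigr)\epsilon_{t-j} + \sum_{j=k}^{\infty}\Bigl(\sum_{l=j-k+1}^{j} a_l\Bigr)\epsilon_{t-j}.$$

Taking its variance, 

$$\sigma_k^2 = k^{-1}\operatorname{var}(y_t - y_{t-k}) = k^{-1}\Bigl[\sum_{j=0}^{k-1}\Bigl(\sum_{l=0}^{j} a_l\Bigr)^2 + \sum_{j=k}^{\infty}\Bigl(\sum_{l=j-k+1}^{j} a_l\Bigr)^2\Bigr]\sigma_\epsilon^2.$$

To simplify the algebra, express $\sigma_k^2$ as a difference equation: 

$$k\sigma_k^2 - (k - 1)\sigma_{k-1}^2 = \sum_{j=0}^{\infty}\Bigl(a_j^2 + 2a_j\sum_{l=1}^{k-1} a_{j+l}\Bigr)\sigma_\epsilon^2 = \Bigl(\sum_{j=0}^{\infty} a_j^2\Bigr)\sigma_\epsilon^2\Bigl(1 + 2\sum_{j=1}^{k-1}\rho_j\Bigr), \tag{A2}$$

where $\rho_j$ = the $j$th autocorrelation of $(1 - L)y_t$, $\rho_j = \sum_{l=0}^{\infty} a_l a_{l+j} / \sum_{l=0}^{\infty} a_l^2$. 
Therefore, 

$$\frac{\sigma_k^2}{\sigma_1^2} = k^{-1}[1 + (1 + 2\rho_1) + (1 + 2\rho_1 + 2\rho_2) + \cdots] = 1 + 2\sum_{j=1}^{k-1}\frac{k - j}{k}\rho_j.$$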
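
As a numerical illustration of the final expression, the following Python sketch computes the variance ratio from a list of autocorrelations and compares it with an MA(1) case, where the variance of $k$-differences can be written down directly from the moving average weights. The function names and the MA(1) check are illustrative assumptions and are not part of the original derivation.

```python
def variance_ratio(rho, k):
    """Return sigma_k^2 / sigma_1^2 = 1 + 2 * sum_{j=1}^{k-1} ((k - j) / k) * rho_j.

    rho[j - 1] holds the j-th autocorrelation of the first differences.
    """
    if k < 1 or len(rho) < k - 1:
        raise ValueError("need k >= 1 and at least k - 1 autocorrelations")
    return 1.0 + 2.0 * sum((k - j) / k * rho[j - 1] for j in range(1, k))


def ma1_variance_ratio(theta, k):
    """Direct MA(1) value of var(y_t - y_{t-k}) / (k * var(y_t - y_{t-1}))."""
    # y_t - y_{t-k} puts weight 1 on e_t, (1 + theta) on the next k - 1 shocks,
    # and theta on e_{t-k}; the innovation variance cancels in the ratio.
    var_k_difference = 1.0 + (k - 1) * (1.0 + theta) ** 2 + theta ** 2
    return var_k_difference / (k * (1.0 + theta ** 2))


theta = 0.5  # hypothetical MA(1) coefficient
rho = [theta / (1.0 + theta ** 2)] + [0.0] * 9  # MA(1): only rho_1 is nonzero
for k in (1, 2, 5, 10):
    print(k, variance_ratio(rho, k), ma1_variance_ratio(theta, k))
```

The same function shows how many small negative autocorrelations at long lags pull the ratio below one even when the first few autocorrelations are positive.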


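B. Derivation of $E(\hat\sigma_k^2)$ for an MA(1) 

Assume that (A1) takes the form 

$$(1 - L)y_t = \mu + (1 + \theta L)\epsilon_t$$

and assume that $\epsilon_t$ are i.i.d. normal. The data set is $T + 1$ observations of the 
levels of $y$, or $T$ observations of its first differences. By definition, 

$$\hat\sigma_k^2 = \frac{T}{k(T - k)(T - k + 1)}\sum_{j=k}^{T}\Bigl[y_j - y_{j-k} - \frac{k}{T}(y_T - y_0)\Bigr]^2. \tag{A3}$$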


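Equation (A2) specializes to 

$$y_t - y_{t-k} = k\mu + \epsilon_t + \theta\epsilon_{t-k} + (1 + \theta)\sum_{j=1}^{k-1}\epsilon_{t-j},$$

and similarly for $y_T - y_0$. Collecting terms in $\epsilon_t$ and noting that $E(\epsilon_t\epsilon_s) = 0$ 
if $t \neq s$, we get (after some algebra) 

$$E(\hat\sigma_k^2) = (1 + \theta)^2\sigma_\epsilon^2 - \frac{2\theta}{k}\,\frac{1 + (k^2/T^2) - [2k/T(T - k + 1)]}{1 - (k/T)}\,\sigma_\epsilon^2.$$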
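Note that (1) as $T \to \infty$, $E(\hat\sigma_k^2) \to [(1 + \theta)^2 - (2\theta/k)]\sigma_\epsilon^2$; (2) as $k \to \infty$, $k < T$, 
$E(\hat\sigma_k^2) \to (1 + \theta)^2\sigma_\epsilon^2 = \sigma_{\Delta z}^2$; (3) for $\theta = 0$, $E(\hat\sigma_k^2) = \sigma_k^2 = \sigma_\epsilon^2$, for all $k$, $T$ such 
that $k < T$. 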
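
The expectation can be checked numerically. The sketch below assumes that the estimator takes the form $\hat\sigma_k^2 = \frac{T}{k(T-k)(T-k+1)}\sum_{j=k}^{T}[y_j - y_{j-k} - (k/T)(y_T - y_0)]^2$ and that first differences follow an MA(1) with coefficient $\theta$. It computes the expectation exactly from the autocovariances of the first differences and compares it with a closed form of the kind given above. Both function names and the parameter values are illustrative assumptions.

```python
def expected_sigma_k2_closed_form(theta, k, T, sigma2_eps=1.0):
    """Closed-form E(sigma_hat_k^2) for an MA(1), valid for 1 <= k < T."""
    correction = (1.0 + k ** 2 / T ** 2 - 2.0 * k / (T * (T - k + 1))) / (1.0 - k / T)
    return ((1.0 + theta) ** 2 - (2.0 * theta / k) * correction) * sigma2_eps


def expected_sigma_k2_exact(theta, k, T, sigma2_eps=1.0):
    """Exact expectation from the MA(1) autocovariances of the first differences."""
    gamma0 = (1.0 + theta ** 2) * sigma2_eps  # variance of a first difference
    gamma1 = theta * sigma2_eps               # first autocovariance; higher ones are zero
    total = 0.0
    for j in range(k, T + 1):
        # Weights on dy_1..dy_T: 1 inside the k-difference ending at j, minus k/T everywhere.
        w = [(1.0 if j - k + 1 <= i <= j else 0.0) - k / T for i in range(1, T + 1)]
        total += gamma0 * sum(x * x for x in w)
        total += 2.0 * gamma1 * sum(w[i] * w[i + 1] for i in range(T - 1))
    return T / (k * (T - k) * (T - k + 1)) * total


for theta, k, T in [(0.0, 3, 10), (0.5, 3, 10), (-0.4, 5, 30)]:
    print(theta, k, T,
          expected_sigma_k2_exact(theta, k, T),
          expected_sigma_k2_closed_form(theta, k, T))
```

With $\theta = 0$ both functions return the innovation variance for any $k < T$, which is the unbiasedness property stated in point (3).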


C. How Maximum Likelihood Imposes Identifying 
Restrictions across Frequencies 

Let $x_t = (1 - L)y_t = A(L)\epsilon_t$. Assume that $A(0) = 1$, that $A(L)$ is one-sided and 
has zeros outside the unit circle, so that the spectral density of $x$ is bounded 
away from zero, and that $A$ has an inverse, so that $x$ has an autoregressive 
representation $B(L)x_t = \epsilon_t$. Consider estimating $A(L)$ or $B(L)$ by maximum 
likelihood via a simple time-series or unobserved components model. For 
simplicity, assume infinite data, $\epsilon_t \sim N(0, \sigma_\epsilon^2)$, and $\sigma_\epsilon^2$ known. (The same point 
survives generalization to more complex estimation environments.) In this 
case, maximum likelihood is equivalent to 

$$\min E[\hat B(L)x_t]^2 \quad \text{subject to } \hat B(L) \in \mathcal{B}, \tag{A4}$$

where $\hat B(L)$ is the autoregressive representation of the estimated model, and 
$\mathcal{B}$ is the restricted space of autoregressive representations allowed by the 
chosen time-series model. Since variance is the integral of spectral density, 
(A4) is the same as 

$$\min\,(2\pi)^{-1}\int_{-\pi}^{\pi}|\hat B(e^{-i\omega})|^2 S_x(e^{-i\omega})\,d\omega \quad \text{subject to } \hat B(e^{-i\omega}) \in \mathcal{B}. \tag{A5}$$


The following expression is equivalent: 

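$$\min\int_{-\pi}^{\pi}|\hat B(e^{-i\omega}) - B(e^{-i\omega})|^2 S_x(e^{-i\omega})\,d\omega \quad \text{subject to } \hat B(e^{-i\omega}) \in \mathcal{B}. \tag{A6}$$

To see this, expand $|\hat B - B|^2$ and substitute $AA^*\sigma_\epsilon^2 = S_x$ (an asterisk denotes 
complex conjugation; I dropped the $e^{-i\omega}$'s). Then (A6) becomes 

$$\min\int_{-\pi}^{\pi}(\hat B\hat B^* + BB^* - B\hat B^* - B^*\hat B)AA^*\,d\omega. \tag{A7}$$

The first term is just (A5). Since $A^{-1} = B$, the second term is $2\pi$, and the 
third and fourth are 

$$-\int_{-\pi}^{\pi}(\hat B A + \hat B^* A^*)\,d\omega.$$

Under the assumption that $A$ and $\hat B$ are one-sided and that $A(0) = \hat B(0) = 1$, 

$$\int_{-\pi}^{\pi}\hat B A\,d\omega = \int_{-\pi}^{\pi}\Bigl(1 + \sum_{j=1}^{\infty}\hat b_j e^{-i\omega j}\Bigr)\Bigl(1 + \sum_{j=1}^{\infty} a_j e^{-i\omega j}\Bigr)d\omega;$$

since $\int_{-\pi}^{\pi} e^{-i\omega k}\,d\omega = 0$, $\int_{-\pi}^{\pi}\hat B A\,d\omega = \int_{-\pi}^{\pi}\hat B^* A^*\,d\omega = 2\pi$. Therefore, (A6) 
reduces to (A5) plus constants. 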

Equation (A6) is analogous to Sims’s (1972) approximation formula, repro¬ 
duced in Sargent (1979, p. 29S). The message of (A6) is that maximum 
likelihood attempts to match the frequency response of the autoregressive 
representation across the entire frequency range, weighted by the true spec¬ 
tral density of x,. The method of maximum likelihood will sacrifice accuracy 
in the estimated $\hat B(e^{-i\omega})$ at a point in the frequency range ($\omega = 0$) in order to 
achieve a better fit over an interval. Similarly, it will sacrifice accuracy in a 
small window (20 years to infinity is $\pi/10$ wide) to gain accuracy in a large 
window (2-4 years is $\pi/2$ wide). If $S_x(e^{-i\omega})$ is smaller near $\omega = 0$ than elsewhere, 
as the variance of $k$-differences suggests for GNP, then (A6) shows 
that maximum likelihood further deemphasizes accuracy in windows about 
zero. 
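
The window widths quoted above follow from converting periods to angular frequencies, $\omega = 2\pi/\text{period}$, with annual data. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
import math


def band_width(shortest_period, longest_period=math.inf):
    """Width in radians of the frequency band covering periods in [shortest, longest]."""
    highest_frequency = 2.0 * math.pi / shortest_period
    lowest_frequency = 0.0 if math.isinf(longest_period) else 2.0 * math.pi / longest_period
    return highest_frequency - lowest_frequency


long_run_window = band_width(20.0)             # periods of 20 years to infinity
business_cycle_window = band_width(2.0, 4.0)   # periods of 2 to 4 years
print(long_run_window, business_cycle_window)
```

The second window is five times as wide as the first, which is why a criterion that weights fit evenly across frequencies gives the long run little influence.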
The Relation between Price and Marginal 
Cost in U.S. Industry 


Robert E. Hall 

Stanford University and National Bureau of Economic Research 


An examination of data on output and labor input reveals that some 
U.S. industries have marginal cost well below price. The conclusion 
rests on the finding that cyclical variations in labor input are small 
compared with variations in output. In booms, firms produce sub¬ 
stantially more output and sell it for a price that exceeds the costs of 
the added inputs. The paper documents the disparity between price 
and marginal cost, where marginal cost is estimated from annual 
variations in cost. It considers a variety of explanations of the 
findings that are consistent with competition, but none is found to be 
completely plausible. 


I. Introduction 

A competitive firm equates its marginal cost to the market price of its 
product. The equality of marginal cost and price is a fundamental 
efficiency condition for the allocation of resources. When the condi¬ 
tion holds, the purchasers of the product equate their marginal rates 
of substitution to the corresponding marginal rates of transforma¬ 
tion. By contrast, under monopoly or oligopoly, the allocation of out¬ 
put will be inefficient because price will exceed marginal cost. 

This paper derives and applies a method for testing the equality of 
price and marginal cost. The method is different from the one used 
in most previous investigations: instead of assuming profit maximization 
and estimating the slope of the demand schedule (as in the studies 
surveyed by Bresnahan [in press]), it looks at actual changes in 
costs. Further, the method makes no parametric assumptions about 
the cost function. It tests equality of price and marginal cost directly 
from data on price, output, and the quantities and prices of inputs. 

The test developed in this paper rests on the assumption of con¬ 
stant returns to scale. That is, the hypothesis being tested is the joint 
hypothesis of competition and constant returns. Because competition 
is inconsistent with increasing returns, it is appropriate to test the two 
hypotheses together. In order to sustain the interpretation that rejec¬ 
tion of the joint hypothesis is unfavorable to the hypothesis of compe¬ 
tition, I show that rejection could not have been caused by decreasing 
returns to scale. 

The conclusion from data for seven industry groups is that the joint 
hypothesis of equality of price and marginal cost and constant returns 
is strongly rejected for five groups and is rejected at lower levels of 
significance for the other two. These findings are confirmed for 26 
more detailed industries. The paper gives attention to possible 
specification and data problems that might explain the findings with¬ 
out invoking a failure of competition and constant returns. The prob¬ 
lems considered explicitly are measurement errors in labor input 
from unmeasured fluctuations in effort per hour of work and other 
sources, errors in measuring output and wages, labor contracts with 
wage smoothing, adjustment costs, price rigidity, and labor aggrega¬ 
tion. 

II. Method 

The essence of the proposed test is to measure marginal cost as the 
observed change in cost as output rises or falls from one year to the 
next. The comparison of movements of inputs with movements in 
output is at the heart of the calculation. Hence the test is closely 
related to the measurement of productivity growth. The starting 
point for the development of the test is a discussion of the properties 
of a particular productivity measure under the null hypothesis of 
competition and constant returns. The covariance of measured pro¬ 
ductivity and an instrumental variable is shown to be zero under the 
null. Then the value of the same covariance is derived under the 
alternative hypothesis of market power with constant returns. The co- 
variance must be positive. This discussion is much more fully devel¬ 
oped because the productivity literature has not discussed the impact 
of market power on productivity measurement at any great length. In 
particular, it is in this section that the relation between measures of 
marginal cost and measures of productivity growth is developed. Finally, 
I discuss more briefly why the covariance also will be somewhat 
positive in the presence of increasing returns to scale. 

Characterization of the Null Hypothesis: 

Competition and Constant Returns 

In general, I will be concerned with a firm that produces output $Q$ 
with a production function $\Theta F(K, N)$ using capital $K$ and labor $N$ as 
inputs; $\Theta$ is an index of Hicks-neutral technical progress. The firm 
faces a stochastic demand for its output, possibly perfectly elastic. It 
faces a labor market in which the firm can engage any amount of 
labor at the same wage, w. Sometime in advance of the realization of 
demand, the firm chooses a capital stock. I do not assume anything 
about the market for capital goods, nor, for that matter, do I assume 
that the firm’s investment policy is optimal. However, I do assume 
that the pure user cost of capital is zero: capital depreciates over time, 
not in relation to use. I also assume that the firm chooses its labor 
input so as to maximize profit and that the choice is made after the 
realization of demand. Finally, I assume that there is at least one 
observable variable that shifts the demand schedule, the labor supply 
schedule, or the level of capital used by the firm. 

In a famous paper, Solow (1957) derived a relationship involving 
output growth, product price, capital and labor input, and the wage 
rate, under the assumptions of competition and constant returns to 
scale. The relationship is 

$$\Delta q_t - \alpha_t\Delta n_t = \theta_t, \tag{1}$$

where $\Delta q$ is the rate of growth of the output/capital ratio ($\Delta\log[Q/K]$), 
$\alpha$ is the factor share earned by labor (ratio of compensation $wN$ to 
total revenue $pQ$), $\Delta n$ is the rate of growth of the labor/capital ratio 
($\Delta\log[N/K]$), and $\theta$ is the rate of Hicks-neutral technical progress 
($\Delta\log\Theta$). Solow recommended evaluating the left side in order to 
measure the rate of growth of productivity. This measure has come to 
be known as total factor productivity because, unlike measures that 
consider only output and labor input, it accounts for capital input 
and, in a more general form, for all other types of inputs. 

The statistic on the left side of equation (1) has come to be known as 
the “Solow residual” and plays a crucial role in the test developed in 
this paper. The economics of the residual are straightforward. Under 
competition and constant returns, the observed share of labor is an 
exact measure of the elasticity of the production function. Without 
any further restriction on the production function, the elasticity can 
be read directly from the data on compensation and revenue. Once 
the elasticity is known, the rate of productivity growth can be obtained 
simply by subtracting the rate of growth of the labor/capital ratio, 
adjusted by the elasticity, from the rate of growth of output. 
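
The calculation just described can be sketched in a few lines of Python. The function below is a minimal illustration, not the paper's own procedure: the name `solow_residuals`, the use of log differences for growth rates, and the use of the current year's labor share are all assumptions made here. The check uses hypothetical Cobb-Douglas data with a constant labor share, for which the residual should recover the growth of the technology index exactly.

```python
import math


def solow_residuals(Q, K, N, w, p):
    """Return dq_t - alpha_t * dn_t for t = 1, ..., len(Q) - 1.

    dq and dn are log changes in the output/capital and labor/capital ratios,
    and alpha_t = w_t * N_t / (p_t * Q_t) is labor's share in revenue.
    """
    residuals = []
    for t in range(1, len(Q)):
        dq = math.log(Q[t] / K[t]) - math.log(Q[t - 1] / K[t - 1])
        dn = math.log(N[t] / K[t]) - math.log(N[t - 1] / K[t - 1])
        alpha = w[t] * N[t] / (p[t] * Q[t])
        residuals.append(dq - alpha * dn)
    return residuals


# Hypothetical data: Q = Theta * K^(1 - a) * N^a, with the wage set so that
# labor's share equals the elasticity a (competition and constant returns).
a = 0.7
K = [100.0, 102.0, 105.0]
N = [50.0, 55.0, 53.0]
Theta = [1.0, math.exp(0.02), math.exp(0.05)]
Q = [Theta[t] * K[t] ** (1 - a) * N[t] ** a for t in range(3)]
p = [1.0, 1.0, 1.0]
w = [a * p[t] * Q[t] / N[t] for t in range(3)]
print(solow_residuals(Q, K, N, w, p))  # approximately the log growth of Theta
```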

Solow had in mind the calculation of the rate of growth of productivity, 
$\theta_t$, separately for each year. Because productivity growth seems 
to have a substantial random element, it is natural to view $\theta_t$ as the 
sum of a constant underlying growth rate, $\theta$, and a random term, $u_t$. 
Then equation (1) becomes 

$$\Delta q_t - \alpha_t\Delta n_t = \theta + u_t. \tag{2}$$

Now suppose that there is a variable, say $\Delta z_t$, that is an important 
outside determinant of output and employment. It could be government 
purchases of the output of this industry, or a measure of the 
shift of labor supply to the industry, or something else that affects $\Delta q$ 
and $\Delta n$. Suppose further that the variable $\Delta z$ is exogenous to this 
equation; that is, it is uncorrelated with the stochastic element of 
productivity growth, $u_t$. In other words, the variable $\Delta z$ is of a type 
that is known from prior reasoning not to cause shifts in productivity 
or to be influenced by productivity shifts that come from other 
sources. Later in the paper I will suggest that the change in military 
spending is one such variable. If the variable $\Delta z_t$ has zero correlation 
with the right side of equation (2), it must have zero correlation with 
the left side as well. This establishes the following proposition. 

Proposition 1. Invariance of the Solow Residual .—Under competi¬ 
tion and constant returns to scale, the Solow residual is uncorrelated 
with all variables known neither to be causes of productivity shifts nor 
to be caused by productivity shifts. 

When a convincingly exogenous variable is found to be correlated 
with the Solow residual, it refutes the joint hypothesis of competition 
and constant returns. The next step is to investigate the power of the 
test and the interpretation of rejection. I will demonstrate that, for 
the case of an instrumental variable $\Delta z$ that is positively correlated 
with output and employment, a positive correlation of $\Delta z$ and the 
Solow residual is most likely a sign of market power. Increasing re¬ 
turns can also explain a slight positive correlation. Conditions of com¬ 
petition with constant or decreasing returns to scale are incompatible 
with a positive correlation of the residual and the instrument. 


Characterization of the Alternative Hypothesis: 

Market Power 

In order to motivate the discussion of the implications of market 
power, I will consider the idea of measuring marginal cost and com¬ 
paring it with price in order to measure market power. The markup 
ratio, the ratio of price to marginal cost, is a good measure of market 
power. Consider the problem of measuring marginal cost for a 
firm with a fixed capital stock and an unchanging technology over 
time. From one period to the next the change in its labor input is $\Delta N$. 
A reasonable approximation to its change in labor cost, abstracting 
from changes in wages, is $w\Delta N$, where $w$ is the current wage. The 
corresponding change in output is $\Delta Q$. Let $x$ be marginal cost. Then a 
good measure of marginal cost is 

$$x = \frac{w\Delta N}{\Delta Q}. \tag{3}$$


The only element of approximation here arises from the use of finite 
differences; the corresponding expression in derivatives is exact. It is 
convenient to rewrite the expression for marginal cost as a relation 
between the rate of growth of output and the rate of growth of labor 
input: 


$$\frac{\Delta Q}{Q} = \frac{wN}{xQ}\,\frac{\Delta N}{N}. \tag{4}$$


That is, the rate of growth of output is the factor share, wN/xQ, times 
the rate of growth of labor input. Recall that in the competitive case 
considered by Solow, the denominator was revenue. Here it is output 
valued at marginal cost, xQ. Again, the factor share measures the 
elasticity of output with respect to input, independent of the form of 
the technology. 

Now let $\mu$ be the markup ratio, $\mu = p/x$, and, as before, let $\alpha$ be 
labor’s observed share in revenue. Then the relation between these 
variables can be written in the earlier notation as 

$$\Delta q_t = \mu_t\alpha_t\Delta n_t. \tag{5}$$

Here I have written each of the variables with a time subscript to 
emphasize that they can change over time. No assumption of constancy 
of either $\mu$ or $\alpha$ is made. In what follows, $\alpha_t$ will always be 
considered time-series data. Under the null hypothesis of competition, 
$\mu$ has the constant value of one, but there is no assumption of 
constancy under the alternative hypothesis. Equation (5) holds for 
any demand function and any technology when the capital stock is 
constant. 

Equation (5) also holds with a slight modification and reinterpreta¬ 
tion for a firm whose capital stock varies over time and that enjoys 
technical progress. The measure of marginal cost that is analogous to 
equation (3) is 

$$x = \frac{w\Delta N + r\Delta K}{\Delta Q - \theta Q}. \tag{6}$$

The change in cost in the numerator now includes a term $r\Delta K$, which 
is the cost of the change in the capital stock, $\Delta K$, evaluated at the 
actual service cost of the new capital, $r$. Alternatively, if the firm is not 
in equilibrium with respect to its use of capital, $r$ is the shadow value 
of capital. In any case, $r$ is not the rate of profit calculated as a residual. 
The denominator in the calculation of marginal cost has an additional 
term, $-\theta Q$, representing an adjustment for the amount by 
which output would have risen in the absence of additional capital or 
labor, assuming that Hicks-neutral technical progress is occurring at 
rate $\theta$. 

Again, it is convenient to rewrite the equation for marginal cost as a 
relation between the rate of growth of output and the rates of growth 
of inputs: 

$$\frac{\Delta Q}{Q} = \frac{wN}{xQ}\,\frac{\Delta N}{N} + \frac{rK}{xQ}\,\frac{\Delta K}{K} + \theta. \tag{7}$$

Unlike its counterpart, equation (4), this relation is not directly usable 
because the shadow value of capital, $r$, is not generally observed. 
Under constant returns to scale, however, it is possible to eliminate $r$ 
from equation (7). With constant returns, the two shares $wN/xQ$ and 
$rK/xQ$ are competitive factor shares; that is, they sum to one. Inserting 
this constraint into equation (7) and rearranging gives 

$$\frac{\Delta Q}{Q} - \frac{\Delta K}{K} = \frac{wN}{xQ}\Bigl(\frac{\Delta N}{N} - \frac{\Delta K}{K}\Bigr) + \theta. \tag{8}$$

In the notation used earlier, this is 

$$\Delta q_t = \mu_t\alpha_t\Delta n_t + \theta_t. \tag{9}$$

Equation (9) expresses the basic idea of the paper. The relation 
between price and marginal cost can be found by comparing the 
actual growth in the output/capital ratio with the growth that would 
be expected given the rate of technical progress and the growth in the 
labor/capital ratio. The baseline for converting labor growth into out¬ 
put growth is to multiply by labor’s share in revenue, $\alpha$. Under competition, 
labor’s share measures the elasticity of output with respect to 
labor input. In that case, $\mu$ will be one; marginal cost and price will be 
equal. When marginal cost falls short of price, because firms perceive 
that raising output to the point of equality will depress the price, then 
$\mu$ will be shown to exceed one. Equation (9) could be used in two ways. 
First, if the data contain no errors and the rate of technical progress is 
known, then it can be solved for $\mu$ in each year: 

$$\mu_t = \frac{\Delta q_t - \theta_t}{\alpha_t\Delta n_t}. \tag{10}$$
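
A minimal sketch of this first use, assuming error-free data and a known rate of technical progress: equation (9) is rearranged to give the markup ratio as output growth net of technical progress divided by share-weighted labor growth. The function name and the numbers are hypothetical.

```python
def implied_markup(dq, alpha, dn, theta):
    """Solve dq = mu * alpha * dn + theta for the markup ratio mu."""
    weighted_labor_growth = alpha * dn
    if weighted_labor_growth == 0.0:
        raise ValueError("markup is not identified when alpha * dn is zero")
    return (dq - theta) / weighted_labor_growth


# Hypothetical year: labor share 0.6, labor/capital growth 4 percent,
# technical progress 1 percent, and a true markup ratio of 1.5.
alpha, dn, theta, true_mu = 0.6, 0.04, 0.01, 1.5
dq = true_mu * alpha * dn + theta
print(implied_markup(dq, alpha, dn, theta))
```

The division makes the year-by-year estimate sensitive to measurement error whenever share-weighted labor growth is close to zero, which is one reason to prefer a statistical treatment of productivity growth.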
Second, in practice the rate of productivity growth will not be known. 
The statistical model of productivity growth introduced earlier considers 
it a constant, $\theta$, plus a random disturbance, $u_t$. Then the Solow 
residual under market power is 

$$\Delta q_t - \alpha_t\Delta n_t = (\mu_t - 1)\alpha_t\Delta n_t + \theta + u_t. \tag{11}$$

The covariance of an instrumental variable with the Solow residual 
will differ from zero because of the term with $\mu - 1$. This establishes 
proposition 2. 

Proposition 2. In the presence of market power, the covariance of 
an exogenous instrumental variable $\Delta z$ and the Solow residual is 

$$\operatorname{cov}(\Delta q - \alpha\Delta n, \Delta z) = \operatorname{cov}[(\mu_t - 1)\alpha_t\Delta n, \Delta z]. \tag{12}$$

To simplify the discussion, I will assume, without loss of generality, 
that the instrument $\Delta z$ is positively correlated with weighted employment 
growth, $\alpha\Delta n$. I will argue that it is altogether likely that the 
covariance of the residual and the instrument is negative or zero 
under competition and positive only under market power. First, if the 
markup ratio, $\mu_t$, is a constant, it is immediately apparent that the covariance 
will be positive if and only if $\mu$ exceeds one. Second, the 
validity of the test based on the covariance extends to cases of variable 
markup ratios. In particular, if the markup varies along with the 
instrument in a linear fashion, if weighted employment growth also 
varies linearly with the instrument, as follows: 

$$\mu_t - 1 = a + b\Delta z, \tag{13}$$

$$\alpha\Delta n = c + d\Delta z, \tag{14}$$

and if the instrument is distributed symmetrically around a zero mean 
with variance $\sigma^2$, then the covariance is 

$$\operatorname{cov}(\Delta q - \alpha\Delta n, \Delta z) = (bc + ad)\sigma^2. \tag{15}$$

Here I have assumed that the first and third moments of $\Delta z$ are zero. 
By hypothesis, $d$ is positive because employment is positively correlated 
with the instrument. If competition prevailed on the average, 
then $a$ would be zero since it is the mean of $\mu - 1$. The parameter $c$ is 
the average growth rate of the labor/capital ratio and is slightly negative 
in all cases. If the markup ratio were positively related to the 
instrument ($b > 0$), then the covariance would be slightly negative 
under competition. The only possibility of a false rejection would 
occur if the rate of growth of the labor/capital ratio were quite negative 
($c < 0$) and the markup ratio were strongly negatively related to 
the instrument ($b < 0$). With market power, the term $ad$ would be 
positive and the test would reject competition unless the $bc$ term were 
negative enough to offset $ad$. 
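
The covariance formula can be verified on a small discrete example. The sketch below assumes the two linear relations for the markup and for weighted employment growth, sets the random productivity term to zero, and uses a hypothetical instrument that takes five equally likely values symmetric around zero, so its first and third moments vanish. All names and numbers are illustrative.

```python
def population_cov(xs, ys):
    """Covariance of two equally weighted lists of outcomes."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def residual_instrument_cov(a, b, c, d, dz, theta=0.0):
    """Covariance of the Solow residual (mu - 1) * alpha * dn + theta with dz."""
    residual = [(a + b * z) * (c + d * z) + theta for z in dz]
    return population_cov(residual, dz)


dz = [-2.0, -1.0, 0.0, 1.0, 2.0]   # symmetric instrument with zero mean
sigma2 = population_cov(dz, dz)    # variance of the instrument

# Market power on average (a > 0) versus competition on average (a = 0).
market_power = residual_instrument_cov(a=0.5, b=0.1, c=-0.01, d=0.3, dz=dz)
competition = residual_instrument_cov(a=0.0, b=0.1, c=-0.01, d=0.3, dz=dz)
print(market_power, (0.1 * -0.01 + 0.5 * 0.3) * sigma2)
print(competition, (0.1 * -0.01 + 0.0 * 0.3) * sigma2)
```

In the competitive case the covariance is slightly negative because $c$ is slightly negative and $b$ is positive, as the text argues.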
To summarize, the covariance of the Solow residual with the instrument 
will be close to zero under competition and positive under market 
power. The only qualifications are that the covariance could be 
slightly positive under competition if the average growth rate of the 
labor/capital ratio were very negative and the markup were strongly 
negatively correlated with the instrument, and that the covariance 
could be zero or negative under market power if the markup were 
strongly positively correlated with the instrument. 

The proposed test rests on the simple proposition that, to the ex¬ 
tent that the firm is noncompetitive, its measured productivity will be 
associated with its rate of growth of labor input over fluctuations 
associated with an exogenous instrument. When productivity rises 
along with employment in response to an outside force, it is a sign that 
the firm is not competitive. 


Example 1: Overhead Labor 

An example will demonstrate how the method deals with a technology 
that seems to describe a number of important U.S. industries. A 
firm has capacity $K$. In order to produce any output at all, it must hire 
$\lambda K$ overhead workers. In addition, for each unit of output, it must 
hire $\phi$ workers. Thus to produce a level of output $Q$, it must have a $K$ 
at least as large as $Q$ and employment of $\lambda K + \phi Q$. The firm’s marginal 
cost is $w\phi$ whenever $Q < K$ and can be taken to be any number 
above $w\phi$ when the capacity constraint is binding. In competitive equilibrium, 
$p = w\phi$ whenever $Q < K$ and $p \geq w\phi$ whenever $Q = K$. Note 
that the technology has constant returns to scale (the fixed component 
of labor is proportional to capacity, not absolutely fixed), so a competitive 
equilibrium is possible. Now consider the measurements proposed 
in this paper for a period in which output is below capacity. 
Labor’s share will be 

$$\alpha = \frac{wN}{pQ} = \frac{w(\phi Q + \lambda K)}{\phi wQ}. \tag{16}$$

Because the competitive firm operates at a loss whenever its output is 
below capacity, the share exceeds one. The Solow residual is 

$$\Delta q - \alpha\Delta n = \frac{\Delta Q}{Q} - \alpha\frac{\Delta N}{N} = \frac{\Delta Q}{Q} - \frac{w(\phi Q + \lambda K)}{\phi wQ}\,\frac{\phi\Delta Q}{\phi Q + \lambda K} = 0. \tag{17}$$
Thus the Solow residual remains unchanged when an outside force 
alters the levels of output and employment. The covariance of the 
Solow residual and an exogenous instrument is zero, and the pro¬ 
posed test will reveal, correctly, that the firm is competitive. Even 
though the variation in labor input itself may be very small because 
most workers are overhead workers, the competitive value of $\alpha$ exceeds 
one by enough to make $\alpha\Delta n$ equal $\Delta q$. The mere existence of 
overhead labor does not lead to the rejection of competition. 

In practice, for those industries that appear to have large overhead 
labor requirements and small variable labor requirements, the behav¬ 
ior of the labor share $\alpha$ and the resulting covariance of the Solow 
residual and an instrument are not at all what is described by the 
competitive model just summarized. Rather, when such an industry 
operates below capacity, its price remains far above the cost of the 
variable component of labor. Profit often remains positive, so labor’s 
share, $\alpha$, is less than one. The ratio of $\Delta q$ to $\alpha\Delta n$ is, say, three, not one. 
The appropriate conclusion is that price is three times marginal cost, 
and the firm is far from competitive. The Solow residual rises sharply 
whenever an outside force causes employment and output to rise. 
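
The overhead-labor example can be worked through numerically. The sketch below uses the finite-difference growth rates of the example with capital held fixed, and compares a competitive price equal to marginal cost with a price three times marginal cost. The function name and all parameter values are hypothetical.

```python
def overhead_labor_measures(price, w, phi, lam, K, Q0, Q1):
    """Return (labor share, Solow residual, ratio of dq to alpha * dn).

    Employment is lam * K overhead workers plus phi workers per unit of output.
    Capital is fixed, so growth of Q/K and N/K equals growth of Q and N.
    """
    N0 = lam * K + phi * Q0
    N1 = lam * K + phi * Q1
    dq = (Q1 - Q0) / Q0
    dn = (N1 - N0) / N0
    alpha = w * N0 / (price * Q0)  # labor's share in revenue in the base period
    return alpha, dq - alpha * dn, dq / (alpha * dn)


w, phi, lam, K, Q0, Q1 = 10.0, 0.2, 0.3, 100.0, 80.0, 88.0
marginal_cost = w * phi

competitive = overhead_labor_measures(marginal_cost, w, phi, lam, K, Q0, Q1)
market_power = overhead_labor_measures(3.0 * marginal_cost, w, phi, lam, K, Q0, Q1)
print(competitive)   # share above one, residual zero up to rounding, ratio one
print(market_power)  # share below one, positive residual, ratio three
```

In both cases the ratio of output growth to share-weighted labor growth equals price divided by marginal cost, which is the sense in which the ratio measures the markup.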


Example 2: Labor Hoarding 

The most widely advocated explanation of the positive correlation of 
output and productivity is that firms carry workers through slumps 
because discharging them would dissipate the value of their job- 
specific human capital. In a simple version of the labor-hoarding 
model, the firm would lay off only $\phi\Delta Q$ of its workers if a slump 
caused by adverse external developments caused output to fall by $\Delta Q$. 
Additional workers would be kept on even though they were idle. 
The economics are then identical to the first example. Marginal cost is 
$w\phi$ and the competitive price should fall to this level. Then $\alpha$ will be 
well above one, so that the Solow residual is invariant to a shift in 
employment and output, even though the change in output is much 
larger than the change in employment. However, if the firm is not 
competitive so that price does not fall enough to make $\alpha$ large, then 
the Solow residual will rise when an outside force raises employment 
and output. 

Characterization of the Alternative Hypothesis: 

Increasing Returns 

The preceding discussion considered the behavior of the test under 
constant returns to scale. This subsection considers departures from 
that assumption. Going back to equation (7) and restating without the 
assumption of constant returns but assuming competition, I get 

$$\Delta q - \alpha\Delta n = (\alpha + \beta - 1)\Delta k + \theta. \tag{18}$$

Here $\beta$ is capital’s factor share, $rK/pQ$, and $\Delta k$ is the rate of growth of 
the capital stock. Note that this reduces to equation (1) under constant 
returns, where $\alpha + \beta = 1$. It is reasonable to suppose that an instrumental 
variable that was positively correlated with employment and 
output growth would also be positively correlated with the growth of 
capital. Under this assumption, it is apparent from equation (18) that 
the covariance of an instrument with the Solow residual will be positive 
under increasing returns ($\alpha + \beta > 1$) and negative under decreasing 
returns ($\alpha + \beta < 1$), as in the following proposition. 

Proposition 3. When price and marginal cost are equal, the 
covariance of the Solow residual with an instrumental variable will be 
positive under increasing returns to scale and negative under decreas¬ 
ing returns to scale. 
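
A short check of the proposition, under the added assumption of a Cobb-Douglas technology $Q = \Theta K^{\beta}N^{\alpha}$ in which the factor shares equal the output elasticities (price equal to marginal cost). The function name and the numerical values are hypothetical.

```python
def solow_residual_with_scale(alpha, beta, theta, dlog_K, dlog_N):
    """Solow residual dq - alpha * dn when output growth is theta + beta*dK + alpha*dN."""
    dlog_Q = theta + beta * dlog_K + alpha * dlog_N
    dq = dlog_Q - dlog_K  # growth of the output/capital ratio
    dn = dlog_N - dlog_K  # growth of the labor/capital ratio
    return dq - alpha * dn


theta, dlog_K, dlog_N = 0.01, 0.03, 0.05
for alpha, beta in [(0.7, 0.3), (0.7, 0.5), (0.7, 0.2)]:
    residual = solow_residual_with_scale(alpha, beta, theta, dlog_K, dlog_N)
    print(alpha + beta, residual, (alpha + beta - 1.0) * dlog_K + theta)
```

With capital growth positively related to the instrument, the residual moves with the instrument under increasing returns and against it under decreasing returns.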

To summarize: Competition and constant returns imply that the 
Solow productivity residual is uncorrelated with an exogenous instru¬ 
mental variable. “Exogenous” means that the variable is neither a 
cause of productivity fluctuations nor a result of those fluctuations. In 
the presence of market power, the covariance of the Solow residual 
and the instrument will be positive, except under very unusual condi¬ 
tions. When an outside force causes output and employment to rise, 
the elasticity of the relation between the two variables will be greater 
than the observed factor share of labor. Too little weight will be given 
to the increase in labor input in the calculation of the Solow residual; 
it will record an increase in measured productivity. The same thing 
would happen in the unlikely case of increasing returns in the pres¬ 
ence of price equal to marginal cost. 

III. Value Added 

In addition to the labor and capital considered in the previous section, 
firms use materials and other intermediate products as inputs to pro¬ 
duction. When time-series data on other inputs are available, it is a 
simple matter to add additional terms to equation (9), each containing 
a factor share multiplying a rate of growth of an input. But it is also 
possible to make use of annual data on nominal and real value added 
in place of full input-output data. This section modifies the earlier 
analysis to deal with that case. In this section, variables with asterisks 
signify measures of the theoretical ideal: $Q^*$ is true gross output, $q^*$ is 
the log of the ratio of $Q^*$ to capital, $p^*$ is the actual price of output, $\gamma^*$ 
and $\alpha^*$ are the factor shares of materials and labor relative to the 
value of gross output, $p^*Q^*$, $\theta^*$ is the rate of Hicks-neutral technical 
progress in the production function relating gross output to all inputs, 
and $\mu^*$ is the ratio of the actual price to full marginal cost. Also, 
$v$ is the price of materials, $M$ is the quantity of materials employed, 
and $m$ is the log of the materials/capital ratio. Then a simple extension 
of equation (9) shows how the hypothesis of competition could be 
tested in this setup: 

$$\Delta q^* - \alpha^*\Delta n - \gamma^*\Delta m = (\mu^* - 1)(\alpha^*\Delta n + \gamma^*\Delta m) + \theta^*. \tag{19}$$


The left-hand side is the Solow residual generalized to include materi¬ 
als. The first term on the right-hand side shows that the Solow resid¬ 
ual will be positively correlated with an exogenous instrument when 
the firm has market power, that is, when $\mu^*$ exceeds one. In the case 
at hand, the output measure that is available is not Q*, gross output, 
but Q, real value added. In that case, the test based on the simple 
Solow residual, computed from real value added and employment 
growth, is a valid test. The rate of growth of the ratio of real value 
added to the capital stock is 


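$$\Delta q = \frac{\Delta(Q/K)}{Q/K} = \frac{p^*\Delta(Q^*/K) - v\Delta(M/K)}{(p^*Q^*/K) - (vM/K)} = \frac{\dfrac{\Delta(Q^*/K)}{Q^*/K} - \dfrac{vM}{p^*Q^*}\,\dfrac{\Delta(M/K)}{M/K}}{1 - \dfrac{vM}{p^*Q^*}} = \frac{\Delta q^* - \gamma^*\Delta m}{1 - \gamma^*}. \tag{20}$$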


This relation can be used to eliminate the unobserved Aq* from equa¬ 
tion (19): 


$$\Delta q - \alpha\Delta n = (\mu^* - 1)\Bigl(\alpha\Delta n + \frac{\gamma^*}{1 - \gamma^*}\Delta m\Bigr) + \theta. \tag{21}$$

Here $\alpha$ is labor’s share in value added and $\theta$ is the rate of technical 
progress stated in labor-capital augmenting form ($\theta = \theta^*/[1 - \gamma^*]$). 
Equation (21) says that the Solow residual calculated from value 
added will be equal to the rate of technical progress, appropriately 
defined, if and only if the firm is competitive, that is, $\mu^*$ is one. The 
covariance of the Solow residual with an exogenous instrument will be 
zero under competition and will be positive under market power. 
This statement is subject to the same minor qualifications stated in the 
previous section plus the additional one that the growth of material 
inputs, $\Delta m$, be positively correlated with the instrument. None of the 
instruments employed in this paper is likely to fail the latter require¬ 
ment. One instrument—the rate of decline of the world oil price—is 
particularly suitable because its substitution effect adds to the impact 
in the appropriate direction. 
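
The step from gross output to value added can be checked with a few lines of arithmetic. The sketch below builds gross output growth from equation (19), converts it to value-added growth using equation (20), and compares the resulting Solow residual with the right-hand side of equation (21). It assumes that labor's share in value added is $\alpha = \alpha^*/(1 - \gamma^*)$, which follows from dividing compensation by $p^*Q^* - vM$. The function names and the numerical values are hypothetical.

```python
def value_added_residual(mu_star, alpha_star, gamma_star, theta_star, dn, dm):
    """Solow residual computed from real value added and employment growth."""
    # Equation (19) solved for gross output growth.
    dq_star = mu_star * (alpha_star * dn + gamma_star * dm) + theta_star
    # Equation (20): growth of the ratio of real value added to capital.
    dq = (dq_star - gamma_star * dm) / (1.0 - gamma_star)
    alpha = alpha_star / (1.0 - gamma_star)  # labor's share in value added
    return dq - alpha * dn


def equation_21_rhs(mu_star, alpha_star, gamma_star, theta_star, dn, dm):
    """Right-hand side of equation (21)."""
    alpha = alpha_star / (1.0 - gamma_star)
    theta = theta_star / (1.0 - gamma_star)
    materials_weight = gamma_star / (1.0 - gamma_star)
    return (mu_star - 1.0) * (alpha * dn + materials_weight * dm) + theta


args = dict(alpha_star=0.3, gamma_star=0.5, theta_star=0.005, dn=0.04, dm=0.06)
for mu_star in (1.0, 1.5, 3.0):
    print(mu_star,
          value_added_residual(mu_star, **args),
          equation_21_rhs(mu_star, **args))
```

When the markup ratio is one, the residual collapses to the rescaled rate of technical progress regardless of how materials and labor move.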

The discussion in this section made the implicit assumption that the 
change in real value added was computed each year using the previ¬ 
ous year’s prices as the base prices (see eq. [20]). In effect, it assumed 
the use of a Divisia index of real value added. In the U.S. national 
income accounts, base prices are changed about once a decade. I 
know of no reason to think that the low frequency of base changes has 
any important influence on the results obtained by the technique in 
this paper. 


IV. Choice of the Instrumental Variables 

The instrumental variables for the test should cause important move¬ 
ments in employment and output but be uncorrelated with the ran¬ 
dom fluctuations in productivity growth. Such exogenous variables 
could operate through product demand or through factor supplies. 
Lack of correlation with the random element of productivity growth 
involves two considerations: First, the instrument must not cause 
movements in productivity, and, second, it must not respond to ran¬ 
dom variations in productivity growth. 

It is a challenge to find instruments that are plainly exogenous
under all views of macroeconomic fluctuations and that also have
large enough influences on employment and output so that the test is
powerful. Recent research has cast doubt on the exogeneity of all
measures of monetary policy that are much correlated with output.
On the fiscal side, only military spending is arguably unresponsive to
the current state of employment and output. No single assumption is
likely to appeal to all schools of thought about the relation between
productivity growth and output fluctuations. Hence, I will present
results for a variety of instruments, suggested by Valerie Ramey.

Military Spending 

Military spending undergoes occasional large fluctuations that do not 
appear to be driven by the business cycle or by fluctuations in produc¬ 
tivity. In addition, there is no reason to think that increases in govern¬ 
ment purchases of certain products should shift the production func¬ 
tions for the industries making those products, at least from one year 
to the next. Were military spending sufficiently correlated with employment
and output, it probably would be the most persuasive instrument
for the purposes of this paper. In addition to government
purchases of goods, which operate through product markets, changes
in military employment help identify the equation through fluctuations
transmitted via the labor market.

The World Oil Price 

It is reasonable to assume that the historical pattern of shifts in the 
world price of oil has not been caused in any important way by fluctu¬ 
ations in U.S. productivity growth. The other part of the argument 
supporting the rate of change of the oil price as an instrument holds 
that shifts in oil prices do not cause changes in productivity. That 
hypothesis is more controversial. Its justification is that changes in 
factor prices do not shift production functions in the short run. 
Under this hypothesis, the observed tendency for measured produc¬ 
tivity to fall when oil prices rise is the result of the negative response 
of output to that rise. 

The Political Party of the President 

Systematic differences in economic policies of the two political parties 
have caused differences in rates of expansion of the industries consid¬ 
ered here, both over time and across industries. Outputs of services, 
durables, and regulated industries have risen noticeably faster under 
Democrats than under Republicans. Under the reasonable hypothesis 
that neither party has adopted policies that affect productivity growth 
in the short run, this systematic difference can be used to test the joint 
hypothesis of competition and constant returns. 

V. Econometric Method 

Under the basic identifying hypothesis that true shifts in productivity
are unrelated to movements of the instrumental variable, testing of
the joint hypothesis of competition and constant returns is a simple
matter of testing the hypothesis that the covariance of the Solow
residual Δq − αΔn and the instrument Δz is zero. To the extent that
periods in which outside forces raise output are periods in which the
actual growth of output exceeds the amount expected from observations
on the revenue share, α, applied to labor growth, Δn, the joint
hypothesis is falsified.

Although the test could be conducted with the raw covariance itself, 
an equivalent and more easily interpretable test is based on the re¬ 
gression coefficient of the Solow residual on the instrument. Thus the 
tests to be employed are t-tests for the exclusion of the instruments 
from the regressions. 
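
A minimal sketch of such a regression test, written in plain Python, is given here for concreteness. The function name and the five data points are hypothetical and are not taken from the data of this paper; the calculation is ordinary least squares of the Solow residual on a single instrument with a constant.

```python
import math


def slope_t_statistic(residuals, instrument):
    """OLS of the Solow residual on one instrument plus a constant.

    Returns (slope, standard error of slope, t-statistic).
    """
    n = len(residuals)
    mean_y = sum(residuals) / n
    mean_z = sum(instrument) / n
    szz = sum((z - mean_z) ** 2 for z in instrument)
    szy = sum((z - mean_z) * (y - mean_y)
              for z, y in zip(instrument, residuals))
    slope = szy / szz          # same sign as the raw covariance
    intercept = mean_y - slope * mean_z
    sse = sum((y - intercept - slope * z) ** 2
              for z, y in zip(instrument, residuals))
    se_slope = math.sqrt(sse / (n - 2) / szz)
    return slope, se_slope, slope / se_slope


# Hypothetical data, for illustration only.
dz = [-2.0, -1.0, 0.0, 1.0, 2.0]
solow = [0.0, 1.5, 2.0, 2.5, 5.0]
slope, se_slope, t_stat = slope_t_statistic(solow, dz)
```

Because the slope is the covariance divided by the variance of the instrument, a test that the slope is zero is a test that the covariance is zero.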

The test does not assume that the markup ratio, μ, is a constant. It
is of interest, however, to gain some sense of the magnitude of the
departure from competition. Estimates of μ based on the assumption
of constancy are useful for this purpose. The estimate of μ obtained
by applying instrumental estimation to equation (9) is


\[
\hat{\mu} = \frac{\operatorname{cov}(\Delta q, \Delta z)}{\operatorname{cov}(\alpha\,\Delta n, \Delta z)}. \tag{22}
\]


This estimator suffers from a subtle defect. When overhead labor and
labor hoarding are extreme, employment growth Δn is hardly correlated
with the instrument, even though output growth Δq is highly
correlated. The resulting estimate, μ̂, is a large number. Moreover,
the variance of μ̂ is large as well. Interpretation of the results is much
enhanced by estimating the reciprocal, 1/μ. The reciprocal maps the
entire region of values of μ greater than one into the interval from
zero to one. The variance of the reciprocal is a much more informative
measure of dispersion than the variance of μ̂ itself. With a single
instrument, the instrumental estimator of the reciprocal is just the
reciprocal of equation (22). With more than one instrument (the two-stage
least squares estimator), the results are not invariant to the
normalization. The reciprocal of the estimate and the estimate of the
reciprocal are not exactly the same, but in practice the differences are
usually very small.
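
The behavior of the two normalizations can be sketched with a single instrument as follows. The data are hypothetical and the function names are invented for the illustration; the second case mimics extreme labor hoarding, in which weighted employment growth barely moves with the instrument.

```python
def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def markup_iv(dq, weighted_dn, dz):
    """Single-instrument estimate of the markup ratio, as in equation (22)."""
    return covariance(dq, dz) / covariance(weighted_dn, dz)


def reciprocal_iv(dq, weighted_dn, dz):
    """Single-instrument estimate of the reciprocal of the markup ratio."""
    return covariance(weighted_dn, dz) / covariance(dq, dz)


# Hypothetical data, for illustration only.
dz = [-2.0, -1.0, 0.0, 1.0, 2.0]
dq = [-1.5, -0.5, 0.5, 1.5, 2.5]

# Case 1: weighted employment growth moves half as much as output.
moderate_dn = [-1.0, -0.5, 0.0, 0.5, 1.0]
mu_moderate = markup_iv(dq, moderate_dn, dz)
beta_moderate = reciprocal_iv(dq, moderate_dn, dz)

# Case 2: labor hoarding; employment growth barely responds.
hoarded_dn = [-0.02, -0.01, 0.0, 0.01, 0.02]
mu_hoarding = markup_iv(dq, hoarded_dn, dz)
beta_hoarding = reciprocal_iv(dq, hoarded_dn, dz)
```

In the hoarding case the markup estimate becomes very large while the reciprocal stays in a small interval near zero, which is why the reciprocal is the more convenient object to report.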


VI. Data 

I have obtained annual data for seven one-digit industry groups and
26 industries at roughly the two-digit level for the years 1953-84.¹
The industry detail is controlled by the labor input measure, which
is an unpublished compilation of hours of work for all workers, including
supervisory workers. The series are the following: Q: real
value added, 1982 dollars (U.S. National Income and Product Accounts
[NIPA]), K: net real capital stock (Bureau of Economic Analysis),
p: implicit deflator with indirect business taxes removed (ratio
of nominal value added less indirect business taxes to real value
added), N: hours of work of all employees (NIPA), and w: total compensation
divided by N.

Note that the data are chosen to eliminate tax wedges as a source of
departures of marginal cost from price. The price level is measured
net of sales and other taxes, and the wage is measured gross of social
security, fringes, and other costs incurred by the employer. The industries
chosen were the most detailed for which the NIPA report
hours of all employees.

¹ The data are available from the author on diskette, together with a complete
description of the sources.

The instrumental variables are the rate of increase of the world 
price of crude petroleum in dollars, the rate of growth of military 
purchases of goods and services in real terms, and a dummy variable 
with the value of one when the president is a Democrat and zero when 
he is a Republican. 

VII. Results 

Nondurables 

Table 1 shows the construction of the Solow residual for nondurables.
Figure 1 shows the evidence in the form relied on in this paper. The
vertical axis plots the Solow residual, Δq − αΔn. With equality of price
and marginal cost, the residual should be unaffected by exogenous
shifts in demand that cause both output and employment to rise. If
marginal cost falls well short of price, then the increase in output will
exceed the corresponding increase in employment multiplied by the
share, α, because α understates the elasticity of output with respect to
labor input. Hence a positive relation between the Solow residual and
an exogenous demand variable is evidence that marginal cost falls
short of price. The horizontal axis of figure 1 is the negative of the
rate of growth of military spending. There is a positive relation between
the Solow residual and the instrument. The explanation offered
here is that the product wage understates the marginal product
of labor; that is, price exceeds marginal cost.

The formal test discussed in Section V confirms the findings of
figure 1. The relation between the Solow residual and the instrumental
variable, Δz, is

\[
\Delta q - \alpha\,\Delta n = \underset{(.004)}{.021} + \underset{(.064)}{.094}\,\Delta z, \tag{23}
\]

standard error: 2.5%, Durbin-Watson statistic: 2.04.

Much stronger evidence against the joint hypothesis of competition 
and constant returns appears when the rate of change of the world 
price of crude oil is the instrument: 

\[
\Delta q - \alpha\,\Delta n = \underset{(.005)}{.029} + \underset{(.042)}{.110}\,\Delta z, \tag{24}
\]

standard error: 2.1%, Durbin-Watson statistic: 1.88.

On the other hand, the covariance of the Solow residual for nondu¬ 
rables and the political dummy variable is very close to zero; nondu¬ 
rables employment and output are hardly affected by the party in 
power. 

TABLE 1

Construction of the Solow Residual for Nondurables (Percentage Change)

                                        Weighted
         Output     Hours      Labor    Hours      Solow
         Growth     Growth     Share    Growth     Residual
Year     Δq         Δn         α        αΔn        Δq − αΔn

1953       1.31       -.47     .73        -.34       1.65
1954      -2.70      -6.33     .74       -4.67       1.97
1955       6.12       2.46     .71        1.75       4.37
1956        .40      -2.39     .72       -1.72       2.12
1957      -2.57      -4.84     .74       -3.60       1.03
1958       -.85      -5.04     .74       -3.74       2.89
1959       9.10       4.65     .72        3.34       5.76
1960       -.58      -1.90     .73       -1.39        .80
1961        .41      -2.66     .73       -1.95       2.36
1962       3.43        .05     .73         .03       3.40
1963       4.56      -2.38     .72       -1.71       6.27
1964       1.35      -2.15     .71       -1.53       2.89
1965       -.77      -2.96     .70       -2.08       1.32
1966      -2.18      -3.83     .70       -2.68        .50
1967      -6.29      -5.75     .71       -4.11      -2.19
1968       1.11      -2.91     .71       -2.07       3.17
1969      -1.24      -3.41     .73       -2.49       1.26
1970      -4.42      -7.84     .74       -5.77       1.35
1971       1.14      -4.95     .73       -3.60       4.74
1972       4.21       -.54     .73        -.39       4.61
1973       5.70       -.58     .73        -.42       6.12
1974     -11.37      -7.40     .75       -5.53      -5.83
1975      -6.01     -10.30     .69       -7.14       1.13
1976       5.66       1.37     .69         .95       4.71
1977       2.96      -1.24     .69        -.85       5.82
1978        .03      -1.33     .71        -.94        .98
1979       -.96      -2.85     .72       -2.05       1.09
1980      -6.34      -5.50     .73       -4.03      -2.31
1981        .65      -2.30     .71       -1.63       2.28
1982      -1.60      -7.52     .69       -5.22       3.61
1983       4.41        .91     .68         .62       3.78
1984        .47       1.20     .67         .81       -.34
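
The arithmetic behind each row of table 1 can be sketched in a few lines of Python. The function name is invented for the illustration; the three rows are copied from the table. Because the labor share is printed to only two decimals, the recomputed columns can differ from the printed ones in the last digit.

```python
def solow_residual(output_growth, hours_growth, labor_share):
    """Return (weighted hours growth, Solow residual), in percent."""
    weighted = labor_share * hours_growth
    return weighted, output_growth - weighted


# (output growth, hours growth, labor share, printed residual) from table 1.
rows = {
    1953: (1.31, -0.47, 0.73, 1.65),
    1959: (9.10, 4.65, 0.72, 5.76),
    1974: (-11.37, -7.40, 0.75, -5.83),
}

recomputed = {
    year: solow_residual(dq, dn, share)[1]
    for year, (dq, dn, share, _printed) in rows.items()
}
```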


Results for Seven Major Industry Groups 

Table 2 presents the test statistics for seven major U.S. industry 
groups. The industries cover all private GNP except for mining and 
agriculture, where the role of natural resources is sufficiently high 
that important measurement issues arise in applying Solow’s method. 
The table gives the marginal significance levels for a one-tailed t-test 
of the hypothesis of the exclusion of the instrument from a one- 
variable regression. The marginal significance level is the probability 
under the null hypothesis that the covariance would be at least as 
large as its observed value. Small values are evidence against the null
hypothesis and in favor of the alternative hypothesis of market power
and increasing returns.

Fig. 1. Solow residual for nondurables plotted against the rate of growth of military
spending.

The rate of change of the world oil price provides the strongest 
evidence against competition. In five of the seven industries, the mar¬ 
ginal significance level is under 3 percent: the observed covariance of 
the Solow residual and the instrument is extremely unlikely to have 
occurred because of chance alone. In one other industry, services, the 
marginal significance level is below 10 percent, which constitutes rea¬ 
sonably strong evidence against the null hypothesis in that industry as 
well. 
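
For readers who wish to see how a marginal significance level of this kind follows from a regression coefficient and its standard error, the sketch below integrates the Student t density numerically. It is an illustration only: the function names are invented, and the 30 degrees of freedom (32 annual observations less two estimated coefficients) are an assumption of the sketch, not a figure stated in the text.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t_stat, df, steps=2000):
    """P(T >= t_stat) under the null, via Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 0.5
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t_stat > 0 else 0.5 + area


# Coefficient and standard error from equation (23); df = 30 is assumed.
p_military = one_tailed_p(0.094 / 0.064, 30)
```

The value obtained for `p_military` can be compared with the entry for nondurable goods under military spending in table 2.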


TABLE 2

Marginal Significance Levels for One-Digit Industries

                                        Military    Oil      Political
Industry                                Spending    Price    Party

Construction                              .327      .003       .090
Durable goods                             .500      .029       .357
Nondurable goods                          .076      .001       .256
Transportation and public utilities       .079      .009       .363
Trade                                     .270      .002       .499
Finance, insurance, and real estate       .121      .271       .198
Services                                  .193      .082       .043

Note. The table shows the marginal significance levels for a one-tailed test of the hypothesis that the covariance
of the Solow residual and the instrument is positive. The sign of the instrument is normalized so that its covariance
with output growth is positive.

Except in finance/insurance/real estate, episodes of oil price increases
saw large reductions in output along with smaller reductions
in labor input. Productivity fell dramatically. I believe that output and
employment fell for some reason other than a downward shift in the
production functions. I infer that the observed factor shares understate
the true elasticity of output with respect to labor because prices
exceed marginal costs.

The rate of change of military spending provides some evidence 
against the joint hypothesis of competition and constant returns in 
construction, nondurables (reviewed in detail in the last section), 
transportation, trade, finance/insurance/real estate, and services. It is 
interesting to note that military spending growth stimulates services 
but retards nondurables and construction. The raw covariances of 
military spending growth and the Solow residuals go in the same 
direction: military spending raises measured productivity in services 
and lowers it in nondurables and construction. In fact, the two 
covariances have the same sign in every industry except durables. 

The political dummy yields reasonably strong evidence against 
competition in two industries, construction and services. In both, hav¬ 
ing a Democrat in power stimulates activity and raises measured pro¬ 
ductivity. 

Results for Detailed Industries 

Table 3 shows similar results for 26 more detailed industries, mostly 
at the two-digit Standard Industrial Classification (SIC) level. Again, 
the oil price instrument generates the most conspicuous evidence 
against the joint hypothesis of competition and constant returns: nine 
of the industries have marginal significance levels below 5 percent. 
The other two instruments bring rejections in a number of other 
industries as well. 

I conclude that, under the assumptions stated at the outset of the 
paper and under the further assumption that all the variables are 
measured accurately, the evidence favors a certain amount of market 
power as against the hypothesis of pure competition. The response of 
productivity to outside events that themselves are unlikely to affect 
productivity or be affected by it cannot be explained under competi¬ 
tion but has a ready explanation through market power. 


TABLE 3

Marginal Significance Levels: Further Industry Detail

                                                   Military    Oil      Political
Industry                                           Spending    Price    Party

20: Food and kindred products                        .398      .023       .265
21: Tobacco manufactures                             .087      .231       .366
22: Textile mill products                            .253      .082       .170
23: Apparel and other textile products               .208      .614       .591
24: Lumber and wood products                         .632      .250       .096
25: Furniture and fixtures                           .043      .063       .447
26: Paper and allied products                        .191      .004       .271
27: Printing and publishing                          .068      .081       .500
28: Chemicals and allied products                    .184      .001       .291
29: Petroleum and coal products                      .053      .001       .425
30: Rubber and miscellaneous plastic products        .285      .237       .246
31: Leather and leather products                     .141      .494       .161
32: Stone, clay, and glass products                  .357      .002       .379
33: Primary metal industries                         .155      .341       .221
34: Fabricated metal products                        .265      .092       .304
35: Machinery, except electrical                     .748      .065       .624
36: Electric and electronic equipment                .252      .027       .038
38: Instruments and related products                 .478      .723       .362
39: Miscellaneous manufacturing industries           .452      .144       .075
48: Communication                                    .356      .216       .674
49: Electric, gas, and sanitary services             .440      .208       .202
371: Motor vehicles and equipment                    .369      .124       .376
372-79: Other transportation equipment               .455      .557       .669
Transportation                                       .022      .020       .219
Wholesale trade                                      .270      .002       .283
Retail trade                                         .319      .013       .596

Note. See note to table 2.


Estimates of the Degree of Market Power under
the Assumption of a Constant Ratio of Price
to Marginal Cost

As I noted in Section V, the most convenient way to measure the
magnitude of market power under the assumption of a constant
markup ratio, μ, is to estimate the reciprocal, 1/μ, which I will call β. It
has the value one under competition and falls short of one to the
extent that price exceeds marginal cost. Table 4 gives estimates of β
with their standard errors for the one-digit industry groups. These
estimates make use of all three instruments together by using the two-stage
least squares estimator. For all seven industries, the estimate of
β is significantly less than one, confirming the result of table 2 that the
hypothesis of competition is rejected.

The estimated values of the markup ratio, μ, are shown in the last
column of table 4. The interpretation of these estimates must heed
the warnings of Section III with respect to the use of data on value
added. The estimate of μ measures the ratio of price less materials
cost (the value added deflator) to marginal cost excluding marginal
materials cost. Such an estimate always overstates μ*, the ratio of price
to full marginal cost. The estimates of μ in table 4 range from a little
under 2 to a little under 4. That is, of the total value added per unit of
sales, only 25-55 percent is marginal cost; the rest is earnings from
market power. The deviations from invariance of the Solow productivity
residual documented in table 2 correspond to economically
significant amounts of market power.

TABLE 4

Estimates of Markup Ratio at One-Digit Level

                                        Estimate of               Durbin-      Markup
                                        Reciprocal,    Standard   Watson       Ratio,
Industry                                β̂              Error      Statistic    μ̂

Construction                              .455         (.103)      1.051       2.196
Durable goods                             .486         (.111)      1.942       2.058
Nondurable goods                          .323         (.102)      2.081       3.096
Transportation and public utilities       .313         (.119)      1.570       3.199
Trade                                     .264         (.109)      1.474       3.791
Finance, insurance, and real estate       .303         (.167)      1.734       3.300
Services                                  .536         (.187)      1.662       1.864

Note. β̂ is the two-stage least squares estimator of 1/μ, with military spending, oil price, and the political dummy
as instruments; μ̂ is its reciprocal. Standard errors are in parentheses.
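
The relation between the columns of table 4 can be sketched as follows. The three rows are copied from the table; the function names are invented, and the simple ratio used to gauge the distance of β̂ from one is an illustrative approximation rather than the exact test statistic used for the table.

```python
def markup_from_reciprocal(beta_hat):
    """Markup ratio implied by an estimate of the reciprocal."""
    return 1.0 / beta_hat


def t_against_competition(beta_hat, std_err):
    """Distance of beta_hat below one, in standard-error units."""
    return (1.0 - beta_hat) / std_err


# (beta_hat, standard error, printed markup ratio) from table 4.
table4 = {
    "Construction": (0.455, 0.103, 2.196),
    "Trade": (0.264, 0.109, 3.791),
    "Services": (0.536, 0.187, 1.864),
}

summary = {
    name: (markup_from_reciprocal(b), t_against_competition(b, se))
    for name, (b, se, _printed) in table4.items()
}
```

Since β̂ is the estimated share of value added accounted for by marginal cost, the range of β̂ across the seven groups corresponds directly to the 25-55 percent range cited in the text. Small differences between the recomputed and printed markup ratios reflect rounding of β̂ to three decimals.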

Table 5 presents estimates of β and μ for the more detailed industries.
Not every industry shows evidence of market power. For example,
in apparel (SIC 23), β is slightly, but not significantly, greater
than one. In three industries (petroleum, mining, and wholesale
trade), the covariances of weighted employment growth, αΔn, and
output growth, Δq, with the instrument (a linear combination of the
military, oil, and political instruments) have opposite signs, which
creates a problem of interpretation, in principle. However, in all
three, the covariance of the instrument with output growth is robustly
nonzero and the covariance with employment growth is very close to
zero. The most reasonable interpretation is that overhead labor and
labor hoarding are important in these industries, which supports the
conclusion that they are not competitive.


TABLE 5

Estimates of Markup Ratio: Further Industry Detail

                                                   Estimate of               Durbin-      Markup
                                                   Reciprocal,    Standard   Watson       Ratio,
Industry                                           β̂              Error      Statistic    μ̂

20: Food and kindred products                        .189         (.144)      1.301        5.291
21: Tobacco manufactures                             .362         (.193)      1.476        2.766
22: Textile mill products                            .388         (.160)      2.384        2.578
23: Apparel and other textile products              1.213         (.592)      1.911         .824
24: Lumber and wood products                         .555         (.223)      2.013        1.801
25: Furniture and fixtures                           .506         (.118)      1.990        1.977
26: Paper and allied products                        .269         (.060)      1.948        3.716
27: Printing and publishing                          .070         (.294)       .961       14.263
28: Chemicals and allied products                    .050         (.067)      1.821       20.112
29: Petroleum and coal products                     -.007         (.122)      1.432     -139.478
30: Rubber and miscellaneous plastic products        .663         (.249)      2.036        1.508
31: Leather and leather products                     .476         (.337)      2.086        2.100
32: Stone, clay, and glass products                  .394         (.090)      1.885        2.536
33: Primary metal industries                         .460         (.100)      2.374        2.172
34: Fabricated metal products                        .607         (.232)      2.478        1.649
35: Machinery, except electrical                     .700         (.265)       .992        1.429
36: Electric and electronic equipment                .324         (.175)      2.284        3.086
38: Instruments and related products                 .716         (.540)      2.558        1.397
39: Miscellaneous manufacturing industries           .223         (.130)      1.888        4.491
48: Communication                                    .028         (.998)      1.764       36.313
49: Electric, gas, and sanitary services             .079         (.290)       .389       12.591
371: Motor vehicles and equipment                    .567         (.191)      3.218        1.763
372-79: Other transportation equipment              1.053         (.413)      1.679         .095
Transportation                                       .251         (.196)      2.743        3.976
Wholesale trade                                     -.271         (.366)      1.076       -3.688
Retail trade                                         .425         (.109)      2.253        2.355

Note. See note to table 4.


Subsequent Research

The results reported here are strongly confirmed by subsequent research
in the framework developed here carried out by Domowitz,
Hubbard, and Petersen (1988). They have a rich body of data on
extremely detailed industries. The data report gross output and material
inputs so that it is not necessary to work with value added. By
pooling industries within two-digit categories, Domowitz et al. are
able to achieve much greater power than the tests of this paper. They
find extremely strong rejection of competition in most manufacturing
industries.

Shapiro (1987), using data similar to those of this paper, extends 
this framework to estimate the elasticity of market demand jointly 
with the ratio of price to marginal cost. He confirms the basic finding 
of market power in numerous industries. 


VIII. Possible Specification Errors 

The basic empirical finding of this paper is that expansions in re¬ 
sponse to outside forces involve a much larger increase in output than 
what would be expected from the observed increase in labor input, on 
the basis of the use of labor’s share as an estimate of the elasticity of 
output with respect to labor input. I offer the interpretation of this 
finding that the share understates the true elasticity because price 
exceeds marginal cost. However, the empirical finding can also be 
explained in a competitive setting through one or a combination of 
specification errors. For a much more detailed analysis of specifi¬ 
cation errors in this setting, see Hall (1987). 


Variations in Work Effort and in Hours 

Suppose that employees put in more work effort per hour when
output and employment are higher. Suppose further that there is a
disamenity of work effort that employers perceive as a cost. Then the
method of this paper is biased toward rejecting competition. The
omission of work effort from αΔn understates its value when it and Δq
are positive and makes the residual larger than it should be. The
residual will be positively correlated with an instrument that causes
increases in output. Note that variations in effort that are costless to
the firm (because there is no disamenity to the worker or because the
disamenity is not passed on to the firm) do not cause any bias. It can
be shown (Hall 1987) that the fluctuations in unobserved effort
needed to rationalize all the correlation of the residual with the instruments
are substantial, with fluctuations as large as 10 percent
above normal in years of high output. Moreover, it is clear that workers
are not compensated on a current basis for their increased effort.
As it is, compensation per hour hardly changes when output changes.
If times of high output are also times of high effort per hour and if
compensation is paid for work in efficiency units, the wage per
efficiency unit declines substantially when output rises. Thus unobserved
fluctuations in work effort bias the test only if employers perceive
added effort as costly but do not pay for the effort on a current
basis.

Fluctuations in unmeasured work effort should not be taken for
granted. Fay and Medoff (1985) found from a survey of employers
that effort is slightly negatively correlated with output, not strongly
positively correlated.

Measurement errors in hours of work could also explain the
findings in a competitive setting if the errors are sufficiently negatively
correlated with movements of output. Purely random errors in
Δn, uncorrelated with the instrumental variable, do not bias the
covariance. However, it is easy to think of reasons why the error
would be negatively correlated. For example, suppose that some
workers always report 40 hours of work per week even though they
work more hours when demand is strong and fewer when it is weak,
and employers perceive these extra hours as costly.

There are two important reasons to discount measurement errors 
in hours. First, the great bulk of variations in total hours arises from 
changes in the number of employees, not in the number of hours per 
employee. There is no obvious explanation for a negative correlation 
in errors in employee counts with the instruments. Second, the data 
used in this study take advantage of all available data on actual hours 
of work; the data are obtained from employers’ payroll records for 
workers paid by the hour, and on hours reported by salaried workers 
in the Current Population Survey. 

Cyclical Errors in Unrecorded Output 

The hoarding of labor during cyclical contractions is probably an
important element of the explanation of cyclical fluctuations in productivity.
As I noted at the outset, the method of this paper properly
adjusts for labor hoarding if it occurs in a competitive industry. However,
labor hoarding could bring about a measurement error that
would cause the method of this paper to overstate the extent to which
competition fails to hold. Specifically, hoarded workers may be put to
work on projects other than the production of measured output.
They may repair equipment, build new facilities, train themselves or
others, and engage in many other investment activities. Though the
NIPA data attempt in principle to include these items in output,
many are no doubt unmeasured. Fay and Medoff (1985) found that
the increase in investment activities of workers idled by a slump was
sufficient to explain an estimated markup ratio of no more than 1.1,
far below the estimates for many industries found in this paper.²

² For details of this calculation, see Hall (1987).

Errors in Measuring Capital

Errors in measuring capital input sufficiently correlated with the instrument
could cause the false rejection of competition. The dominant
source of cyclical measurement errors is likely to be the difference
between capital in use and capital available. The first is required
by the theory, but the second is what is actually used in the calculations
of this paper. The difference matters if the shadow value of
capital remains positive in episodes in which firms are not using all
their available capital. If some capital has gone out of use because it is
redundant, there is no bias in the type of test used in this paper.

Capital will be taken out of use even when it has a positive shadow 
value if there is a pure user cost of capital, that is, a wearing-out cost 
avoidable by taking capital out of use. But even if the fraction of 
capital taken out of use in a slump is equal to the proportional decline 
in hours of work, the resulting estimate of the markup ratio in a 
competitive industry is only the reciprocal of labor’s observed share, 
well below what is found in quite a few industries. 

Capital can just as well have a negative shadow value in episodes 
when not all available capital is in use. Overhead labor technologies 
typically have this feature. When output falls below capacity, firms 
have to keep staffing their redundant capital with expensive overhead 
labor and would be better off if they could junk some of their capital 
temporarily. If such a firm does succeed in idling part of its capital, 
but the remainder still has a negative shadow value in a slump, then 
the bias will be toward, not away from, a finding of competition. 


Cyclical Errors in Measuring Labor’s Share 

Errors in measuring the value of α that are correlated with the instrument
but do not affect the mean value of α are benign in this framework.
Examples of measurement errors with this character are (1)
payment of workers under wage-smoothing arrangements, where the
wage equals the long-run opportunity cost of time but does not track
short-run fluctuations in labor market conditions; (2) adjustment
costs in employment, where the full marginal cost of incremental
hours of work fluctuates above and below the observed wage; and (3)
price rigidity, where prices are set at the long-run average of marginal
cost.

Under wage smoothing, workers receive less than their marginal
products in good times and more in bad times. Hence the share α is
understated in good times and overstated in bad times. When the
instrument is positive and consequently output growth and employment
growth are also positive, the Solow residual measured with too
small an α is also positive. A positive term enters the covariance of the
instrument and the residual. On the other hand, when the instrument,
output growth, and employment growth are all negative, the
Solow residual is positive as well: the term −αΔn is overstated in a
positive direction because Δn is negative. A negative term enters the
covariance of the instrument and the residual. In data with an approximately
equal mixture of good and bad times, the covariance will
turn out to be zero. That is, a competitive industry with wage smoothing
will not generate data that reject the invariance property.
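
The cancellation described here can be illustrated with a small numerical sketch. Everything in it is hypothetical: a competitive industry with a true labor share of .7, a measured share that is off by .05 in the direction implied by wage smoothing, and an instrument that takes symmetric positive and negative values. The contrasting case, in which the share is understated in all years, is included only for comparison.

```python
def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


# Hypothetical competitive industry: dq = alpha * dn + theta exactly.
true_alpha, delta, theta = 0.7, 0.05, 0.02
dz = [-2.0, -1.0, 1.0, 2.0]          # equal mixture of good and bad times
dn = [0.01 * z for z in dz]          # employment growth follows the instrument
dq = [true_alpha * n + theta for n in dn]

# Wage smoothing: share understated in good times, overstated in bad times.
smoothed_alpha = [true_alpha - delta if z > 0 else true_alpha + delta for z in dz]
residual_smoothing = [q - a * n for q, a, n in zip(dq, smoothed_alpha, dn)]
cov_smoothing = covariance(residual_smoothing, dz)

# Comparison: share understated in every year.
residual_understated = [q - (true_alpha - delta) * n for q, n in zip(dq, dn)]
cov_understated = covariance(residual_understated, dz)
```

In the wage-smoothing case the residual exceeds θ in both good and bad years, so the positive and negative contributions to the covariance offset one another; in the comparison case they do not.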

Adjustment costs in employment have the same character as wage
smoothing. Half the time, the shadow cost of labor to the firm exceeds
the wage, and the measured value of α understates the true value.
These are times when the current change is in a direction that adds to
adjustment costs. The other half of the time, the shadow cost of labor
falls short of the wage because the current change in employment
conserves adjustment costs. There is no bias in labor’s share, α, in the
long run, but there are measurement errors correlated with the instrument.
But the errors cancel out, and there is no reason to expect
to find a correlation of the Solow residual and the instrument in a
competitive industry with labor adjustment costs.

Price rigidity could arise in a competitive industry if firms find it
necessary to post prices before observing current demand. If firms
stand ready to serve all demand, then the same type of symmetry
prevails as that described earlier. When demand is strong, $\alpha$ is overstated
because the pQ in the denominator understates the true value
of output based on marginal cost. When demand is weak, $\alpha$ understates
labor’s share in marginal cost. But there is no resulting correlation
of the Solow residual and an instrument correlated with demand.
Rotemberg and Summers (1987) examine this case in more detail.
They show that a positive covariance of the Solow residual and an
instrument would arise if the firm’s behavior is asymmetric, serving all
demand in the low-demand states but rationing output if marginal
cost exceeds the predetermined price.


IX. Interpretation and Conclusions 

The basic fact found in this paper is neither new nor surprising. 
When output rises, firms sell the output for considerably more than 
they pay for the incremental inputs. Most economists have been con¬ 
tent to invoke the idea of cyclical fluctuations in productivity in think¬ 
ing about this fact. My point in this paper is that the fact may involve a 
dramatic failure of the principle that marginal cost is equated to price. 
Marginal cost is literally the increase in the cost of inputs needed to 
produce added output. That increase is small, so marginal cost is
small. When it is compared to price, a large gap is found in many
industries. The most obvious explanation of the finding of price far in 
excess of marginal cost is monopoly power in the product market. 
Since few American firms are simple monopolies, the finding proba¬ 
bly requires an interpretation in terms of theories of oligopoly and 
product differentiation. Then the finding lends strong support to the 
view that these theories are more realistic than the simple theory of 
competition. 

Departures from competition in the product market are not the 
only potential explanation of the finding of this paper. Monopsony in 
input markets is another possibility. For example, a monopsonist in 
the labor market faces a marginal cost of labor in excess of the wage it 
pays. In principle, a firm with sufficient monopsony power in the 
labor market but facing competitive conditions in its product market 
could have its price equal to its actual marginal cost, but well above the 
level inferred from the quoted wage in my calculations. However, I
am not aware of any reason to think that monopsony in input markets 
is anywhere near pervasive enough to explain the findings. On the 
other hand, simple monopoly or more complicated types of monop¬ 
oly power in labor or other input markets have no role in explaining 
the finding. In the labor market, all that is needed for my purposes is 
that the measured wage is the actual incremental cost of labor. 
Broader efficiency issues will rest on the question of whether the wage 
correctly values the forgone time of workers, but the narrow hy¬ 
pothesis that the firm is a price taker in input markets is all that is 
needed for measuring the price/marginal cost ratio. 

All the findings of this paper can be interpreted as revealing imper¬ 
fectly competitive markets only within the basic identifying hypoth¬ 
eses set forth at the beginning of the paper. The instruments are not 
causes of productivity shifts, nor are they influenced by those shifts. 
Labor input is measured reasonably accurately, including its effort as 
well as its hours dimension. Capital input is reasonably accurately 
measured, or its user cost is sufficiently low that measurement errors 
are irrelevant. I consider all these to be reasonable hypotheses. Conse¬ 
quently, I find the evidence against pure competition reasonably con¬ 
vincing. 


Physicians' Services and the Division of Labor
across Local Markets 


James R. Baumgardner 
This paper reports empirical evidence of systematic cross-locale variation
in the degree of division of labor among physicians. A theoretical
model, based on an individual producer’s trade-off between
increasing returns and falling marginal revenue within each activity,
motivates the empirical tests. At two levels of aggregation, specialization
is correlated with local demand shifters for medical services.
At the individual level, I find systematic differences in the range of
procedures performed within a specialty class. General practitioners
working fewer hours, practicing in more populated counties, or
practicing in counties with more elderly produce a narrower range
of procedures.


I. Introduction 

The empirical observation that the number of activities bundled into 
one person’s job varies with market (demand) conditions dates to 
Adam Smith’s (1776) classic work. This paper presents evidence of 
variations in the degree of the division of labor among physicians 
across geographically local markets. 

I thank all who have contributed helpful comments and discussions. Special thanks 
go to Gary S. Becker, David Dranove, Michael Grossman, Boyan Jovanovic, William D. 
Marder, Roger A. Reynolds, Sherwin Rosen, and an anonymous referee. The Center 
[...] University Consortium for Political and Social Research. Related empirical work appearing
in Baumgardner (1986) benefited from the assistance of Emil Berendt, Ted
Joyce, and Mary R. McCarthy. Thanks to F. L. Smith for typing. Responsibility for all 
views and errors rests with the author. 

In earlier work (Baumgardner 1986, 1988), I proposed a theoretical
model that explained and predicted cross-market variations in the
degree of the division of labor within service industries. That model
relies on a trade-off between increasing returns production of each
activity for an individual worker-producer and falling marginal revenue
with output of each activity.

Geographically local markets are assumed to be relevant because of 
transportation costs and because the provision of medical services 
requires the doctor and patient to meet in the same place. While this 
assumption may not be strictly true in all cases, I take the position that 
it does capture an important element in explaining the local variations 
in physician specialization that are observed. At a theoretical level, the 
assumption that a doctor can serve only patients living in the prac¬ 
titioner’s local market allows formulation of a tractable model with 
empirical content. At an applied level, the local market assumption 
suggests that variables affecting local demand for medical services will 
influence the range of activities performed by a local doctor. 

Throughout the text the reader should understand the use of the 
term “specialization.” In the context of this model, “specialization” 
refers to the degree to which a doctor has narrowed his practice 
activities to a small set of patient problems or medical procedures. A 
doctor who treats a relatively small number of diseases or performs a 
relatively small number of procedures is more specialized than a doc¬ 
tor performing a wider set of activities. I stress the meaning of 
specialization to avoid confusion with the use of the term to describe 
years of formal postgraduate education. I am concerned with the 
range of activities performed in practice. 

I will present empirical evidence on variations in specialization at 
two levels of aggregation using two different types of measures of 
specialization. The “aggregate” level will focus on the degree of
specialization of the representative doctor in a county, with the county 
as the unit of observation. The number of doctors in various specialty 
categories serves as a basis for measuring the degree of specialization 
of the representative doctor. The “micro” level takes an individual 
physician as the unit of observation and uses information on proce¬ 
dures performed in the previous month to form a measure of the 
doctor’s degree of specialization. Specialization is measured by a 
Herfindahl index based on procedures performed. The micro evi¬ 
dence looks at differences in specialization at a finer level than the 
aggregate evidence since the former includes only doctors within a 
single specialty class. 
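A Herfindahl index of this kind can be sketched as the sum of squared shares of each procedure in a doctor's total. The function below is a minimal illustration; the use of raw procedure counts as the basis for shares, the function name, and the example figures are assumptions, since the exact construction is not spelled out at this point in the text.

```python
from typing import Sequence


def herfindahl(procedure_counts: Sequence[float]) -> float:
    """Sum of squared shares of each procedure in the doctor's total."""
    total = float(sum(procedure_counts))
    if total <= 0:
        raise ValueError("at least one procedure must have been performed")
    return sum((count / total) ** 2 for count in procedure_counts)


# Hypothetical monthly procedure counts for two doctors
generalist = herfindahl([25, 25, 25, 25])  # four procedures in equal shares
specialist = herfindahl([90, 10, 0, 0])    # practice concentrated in one procedure
```

A higher value indicates a practice concentrated in fewer procedures, that is, a more specialized doctor in the sense used here.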

The combination of evidence from both levels of data reinforces
the major finding that the degree of specialization depends on variables
affecting local market demand for medical services. The most
important of these demand shifters is the local population. Age distribution
and education variables also appear to affect specialization,
while the influence of per capita income is negligible. Examination of
residuals indicates that markets with more doctors than expected on
the basis of observable demand shifters also have more specialized
doctors than predicted. Theoretical considerations suggest that both
cooperation and noncooperation among local physicians are consistent
with this empirical finding.

For a more complete treatment of the theoretical model and related
references, see Baumgardner (1988). Related works using activity-specific
human capital as a source for gains from specialization are
Becker (1981, 1985), Gros (1983), Rosen (1983), and Barzel and Yu
(1984).

In the medical economics literature, Newhouse et al. (1982) investigate
differences in specialist-population ratios and general practitioner-population
ratios as a function of the local population. Their
model assumes that specialists are the unique producers of certain
services and will automatically provide other services that are provided
by generalists. Since there is only sufficient demand for the
special services in larger cities, Newhouse et al. argue that specialists
locate first in large cities, and since they also provide the services of
generalists, generalists get squeezed out to smaller towns. I make
assumptions at a much more primitive level and view the range of
activities produced by an individual as endogenous.

An important difference between my view and that of Newhouse et
al. concerns the mapping between a doctor’s specialty classification
(e.g., family practitioner or obstetrician-gynecologist) and the range
of activities he performs. In the micro-level work, I present evidence
that there exist differences in the range of activities performed within
a specialty class and that the range varies in a systematic way with local
demand shifters. Such within-specialty differences are ignored by the
Newhouse et al. model (and by my aggregate-level analysis), which
assumes that the procedures performed by doctors within a particular
specialty class (such as general and family practitioners) are invariant
to local demand variables. 1

Boulier and Wilson (1982) find specialization increasing in population
and per capita income. Their study is limited to measures of
specialization at the state level using the proportions of physicians in
various specialty classes to form measures of specialization.

1 I must point out that the micro findings conflict with a working assumption made in
the aggregate-level work. At the aggregate level, I assume that a doctor in a particular
specialty class performs the same range of activities independent of local demand
characteristics. This invariance assumption performs a role in the work here different
from its role in Newhouse et al. (1982). I use the assumption (in the aggregate-level
analysis) in order to justify use of specialty classification as an empirical measure of the
range of procedures performed or diseases treated. On the other hand, Newhouse et
al. use the assumption to drive their theoretical prediction that general practitioners
will be forced out of densely populated areas.

I proceed with a theoretical discussion and an outline of the empirical
strategy in Section II. Section III describes the data and presents
the empirical results. Concluding comments appear in Section IV. I
have two goals: (1) to test the implications of the theory and (2) to
obtain a statistical description of variations in the degree of the division
of labor as a function of local demand-shifting variables, of supply
shifters of the number of local physicians, and of an individual’s
productive endowments.


II. Theoretical Considerations and Empirical 
Strategy 

In this section I integrate a brief theoretical discussion with a set of 
regression equations to be estimated in Section III. No attempt is 
made to fully develop the theoretical model. Interested readers are 
referred to Baumgardner (1988). 

I want to explain variations in the range of different medical activi¬ 
ties provided by an individual doctor. The number of different activi¬ 
ties provided by an individual may depend on various characteristics 
of the local market and the individual’s productive endowments. The 
economy is abstractly viewed as a set of local markets with each doctor 
confined to practice in only one of these markets. The markets differ 
in variables that are exogenous to the model. For the purposes of this 
paper, the important variables are those that shift the market demand 
curves for the entire spectrum of potentially provided medical activi¬ 
ties. To the extent that physicians value nonpecuniary aspects of their 
local markets, measures of these amenities are also relevant. 

Doctors face two types of decisions: choice of a local market and 
choice of degree of generalization (the number of different medical 
activities to provide). 2 In equilibrium, if underlying differences in 
abilities are accounted for, income-maximizing doctors will distribute 
across markets so that physician income is equalized across all markets 
that have a doctor. Positive demand shifters for medical activities, 

* For purposes of theoretical abstraction, I view the choices as occurring simulta¬ 
neously. One may also view the decision on range of activities as one made early in a 
doctor’s career, but made with a view to the expected characteristics of a local market in 
which the doctor will settle. Alternatively, the local market may be chosen on the basis 
of the number of activities the doctor has learned to produce. Either way, it is as if the 
doctor has made a joint choice of location and generalization. Also, a doctor can make 
choices during his career to become more or less specialized. One expects such changes 
if initial choices turn out to be ex post nonoptimal given the doctor's location decision 
and if costs of relocating exceed those of changing one’s degree of specialization. 
such as local population, will be associated with a greater local number
of doctors (N). To the extent that doctors care about local amenities
(Z) as well as income (Y), amenities will also increase N. That is,
equilibrium will have $\bar{U} = U(Y_l, Z_l)$ in all markets that have at least one
practicing physician, where $\bar{U}$ is the equilibrium level of physician
utility, and the $l$ subscript refers to a local market. In equilibrium,
more amenities will carry a compensating reduction in Y.

These considerations give the regression equations 

$$N = \alpha' X + e, \tag{1}$$

$$Y = \xi' X + \theta' T + d, \tag{2}$$

where X is a vector of demand shifters affecting the local market’s
demands for medical activities, T is a vector of personal endowments,
and e and d are residuals. If local amenities are relevant, they should
also be included in (1) and (2), with positive coefficients expected in
(1) and negative coefficients in (2). The coefficient vector $\xi$ is expected
to equal 0 in equilibrium.

The other decision for a doctor is the choice of number of activities.
In theory, the set of all medical activities is represented abstractly by a
segment of length one. The different medical activities are indexed
by $s \in [0, 1]$. With $A$ defined as the set of activities that an individual
doctor chooses to perform, $\delta = |A|$ refers to the number of activities
the doctor chooses to produce; $\delta$ can be conveniently interpreted as
an index of generalization. 4

Each doctor has fixed endowments of productive factors that must 
be allocated along two margins, an extensive margin and an intensive 
margin. The endowment constraints are 

$$T \geq \int_A t(s)\,ds, \qquad H \geq \int_A c\,m(s)\,ds, \tag{3}$$

where T is the endowment of work time, t(s) is time spent producing
activity s, H is the endowment of human capital that must be allocated
to skills specific to the production of specific activities, m(s) refers to
the amount of specific skill used in producing activity s, and c is the
resource cost of a unit of skill per activity. The extensive decision
concerns the number of activities to produce, |A|. The intensive decision
refers to the quantities of inputs (m and t) to (and thus outputs
of) each produced activity. Given his productive endowment, the physician
can provide low input levels to many activities or high input
levels to few activities. Gains from specialization come from an increasing
returns technology within each activity,

$$q(s) = m(s)\,t(s), \tag{4}$$

where q(s) is output of activity s. Assume the same production technology
(4) for each activity. When (3) is coupled with (4), a decrease in
the number of activities produced allows a greater than proportional
increase in output per produced activity. 5

3 Given one’s decision of a practice locale, the doctor will organize his practice activities
to maximize Y. I abstract from labor-leisure choice throughout the discussion.

4 We will not be concerned with which particular medical activities a particular doctor
chooses. Our interest is the degree of generalization, or number of activities, chosen.
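The claim that fewer activities allow a more than proportional increase in output per activity can be checked in a simple special case. The sketch below assumes, purely for illustration, that the doctor uses his full endowments and spreads them evenly over the $\delta = |A|$ activities he produces; the even split is an assumption of this example, not something the text imposes.

```latex
% Even allocation of the endowments in (3) across \delta activities (illustrative assumption)
t(s) = \frac{T}{\delta}, \qquad m(s) = \frac{H}{c\,\delta}

% Output per produced activity from technology (4)
q(s) = m(s)\,t(s) = \frac{T H}{c\,\delta^{2}}

% Total output summed over the \delta produced activities
\delta\, q = \frac{T H}{c\,\delta}
```

In this special case, halving $\delta$ quadruples output per produced activity and doubles total output, which is the sense in which the increase is more than proportional.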

Table 1 collects implications of the theory to be tested in Section
III. The implications are divided into two types: (1) reduced form
and (2) structural. The terminology “reduced form” indicates that we
are testing across-market equilibrium implications of the model in
which both degree of specialization (S) and number of doctors (N) are
endogenous to each local market. Reduced-form effects of local demand
shifters (X) on S incorporate the fact that local demand shifters
also affect N. Structural implications separately assess the effects of X
and N on S. Differences in underlying assumptions about the competitive
structure within markets generate different implications. I will
briefly discuss three of these assumptions in turn.


A. Pure Price Taking within Activities 

If an individual doctor viewed prices for each activity as independent
of output level, $p(q(s)) = p$ for all $s$, he or she would

$$\max_{\delta} \; Y = \int_A p\,q(s)\,ds = \delta p q \tag{5}$$

subject to (3) and (4). The production trade-off induced by the 
within-activity increasing returns would cause him to specialize into a 
single activity. 6 If doctors behaved in this fashion in all local markets, 
there would be no systematic variation in degree of specialization 
across markets. In table 1, I refer to this as the case of pure price 
taking within activities. 7 

5 Technology (4) has the intuitive appeal that activity-specific human capital is required
for production, and for a given level of activity-specific skill, there are constant
returns in specific time. See Becker (1981, 1985), Gros (1983), Rosen (1983), and Barzel
and Yu (1984) for applications of this approach. Economies of scale may also arise for
reasons given by Smith (1776), such as economizing on setup costs or increased dexterity
with scale. See also Edwards and Starr (1987).

6 Since the set of activities is modeled as a continuous interval, there is actually no
interior maximum, $\delta \in (0, 1)$, that satisfies (5). Maximization implies that the individual
would collapse his practice activities toward a singleton. This is interpreted as complete
specialization. See Rosen (1983) for the analogous result in a world of two discrete
activities.

7 Some sort of strong complementarity in the production of several activities may
cause the doctor to produce the entire group of complementary activities. The essential
point is that under pure price taking within each activity there is no reason for variations
in degree of specialization across markets. The complementarity case under price
taking is treated in Rosen (1983).


B. Downward-sloped Local Market Demand
within Activities 

The implications of two other forms of the theoretical model appear 
in the final two columns of table 1. In both of these cases, the produc¬ 
tion-side gains from specialization are countered by demand-side 
losses. Assume that doctors perceive downward-sloped local market 
demand for each activity: 

$$p(s) = p(Q(s), X), \qquad p_Q < 0, \tag{6}$$

where p(s) is the local demand price for activity s, and Q(s) is the local
market quantity. Assume that demand is the same for all activities 
within a given local market and that marginal revenue falls faster than 
demand price. 8 With falling local market demand and marginal reve¬ 
nue for each activity, increases in specialization may drive marginal 
revenue so low that a doctor may prefer less specialization and the 
higher prices that result. An individual’s optimal choice of degree of 
specialization balances the specialization gains from the within-activity
increasing returns against the losses from within-activity declining 
marginal revenue. The income-maximizing choice of degree of 
specialization for a doctor who can behave as a local monopolist de¬ 
pends on the local values of the demand shifters. Ceteris paribus, a 
greater local population implies greater specialization by the monopo¬ 
list since market demand and marginal revenue in each activity are 
less confining in a market containing more consumers. 

1. Cooperation 

Column 2 in table 1 reports implications of the theory under the 
assumption that the doctors within each local market organize to max¬ 
imize revenue per doctor within the local cooperative. Assume that 
the local cooperative cannot prevent immigration of doctors and 
therefore takes N as given. Because of increasing returns production 
at the individual level, the cooperative will segregate the doctors. Each 
will produce 1/N of the medical activities. It is easily seen that any
overlap of doctors in the production of activities will be inefficient. 
Segregating the doctors will allow more output of some activities with 
no less production of other activities than is possible when overlap 

8 The former assumption is for tractability, and the latter ensures second-order
conditions. Also, readers should not confuse this model with differentiated products 
models in which activities closer to each other on the activities segment are closer 
substitutes in demand. In this model, the products on the locus are different products 
that have independent demands. The actual positions on the segment are irrelevant in 
this streamlined model. Our interest is in the length of the segment (or number of 
different activities) produced by a single doctor. 
occurs. 9 The cooperative need not fix prices or quantities, practices
that usually come to mind in discussions of cartel behavior. The
cooperative needs only to enforce the division of activities across the
doctors. Given the division of activities, individual maximization is
consistent with the cooperative’s objective.

The implication that under cooperation more doctors, with demand
variables constant, leads to increased specialization follows
since the activities segment must be divided among more practitioners.
Greater population (N constant) can lead to greater specialization
if N is small enough that the individual doctors can behave as if they
are individual monopolists.

The reduced-form implications under cooperation incorporate the
fact that N depends on X. In equilibrium, more populated local markets
will have greater N and therefore greater specialization (S). Furthermore,
any other variables that increase the equilibrium number
of physicians in a locale will be accompanied by a greater degree of
specialization within that market.

2. Noncooperation 

A Cournot type of imperfect competition between local doctors gives
implications different from either the pure price-taking or cooperative
forms of the model. The individual chooses a subset of the medical
activities to produce as well as a quantity of each produced activity
to maximize individual income subject to (3), (4), and (6), taking as
given the sets of activities and quantities per activity produced by the
other local doctors. We shall focus on the implications of a symmetric
Nash equilibrium in each local market.

One characteristic of this noncooperative case is that sets of activi¬ 
ties may be produced by several local producers. Unlike the coopera¬ 
tive case, this overlapping of local doctors can occur for the following 
reason. In calculating one’s marginal revenue in the production of a 
particular activity under the Cournot conjecture, an individual does 
not take into account the effect of increased own output on the reve¬ 
nue received by other producers of that activity. This failure to inter¬ 
nalize leads to a breakdown of the segregated solution that we saw 
under cooperation. 

9 The cooperative form of organization may lead to unused capacity to the extent
that full use of individual endowments would drive market marginal revenue per
activity negative. Cooperation gives the greatest possible output from the employed
resources; however, it may be in the cartel's interest to leave some productive endowments
unused. Also, note that the 1/N cooperative solution assumes that the locale has
too many doctors for each to behave as an individual monopolist.

With N held constant, say the local population increases. Market
demand for each activity swings out, and under the Cournot conjecture,
the residual demand curve faced by an individual doctor also
swings outward. Likewise, the individual producer’s marginal revenue
curves swing out. The constraint of falling marginal revenue in
each activity becomes less severe, and the individual doctor responds
by specializing in a narrower set of activities.

Unlike the cooperative case, an increase in N (with market demand 
variables held constant) has an ambiguous effect on the degree of the
division of labor. The intuition that can lead to a decrease in speciali¬ 
zation when N increases is that the increase in the number of local 
doctors generally leads to a greater amount of overlap on the activities 
segment. Within any given activity, the residual demand curve and 
marginal revenue curve facing any one of the producers is shifted 
inward. The individual physician responds to the more restrictive 
individual marginal revenue curve by generalizing across more activi¬ 
ties. 10 

The reduced-form relationship between individual degree of spe¬ 
cialization and demand shifters is produced by the possibly opposing 
forces sketched in the previous two paragraphs. For example, con¬ 
sider two locales differing in population. The difference in market 
and individual marginal revenue curves across the two markets is a 
force leading to more specialization in the more populated locale. 
The equilibrium requirement of equal income (or utility) per doctor 
across locales implies more physicians in more populated areas. A 
greater number of local producers is a force that can lead to less 
specialization in the more populated locale. The equilibrium correla¬ 
tion depends on the net effect. 

C. Reduced Form: Estimation 

The discussion above motivates the estimation of reduced-form equations
for N and local degree of specialization (S). Local number of
doctors was discussed earlier (eq. [1]). In equilibrium, specialization
will depend on local demand shifters for medical activities,

$$S = \beta' X + u. \tag{7}$$


10 The following example illustrates how an increase in N can lead to more specializa¬ 
tion. Consider a situation in which noncooperation and cooperation are consistent with 
each other and all producers produce 1/N of the activities. If N is increased, each
individual producer may find it more profitable to specialize further and increase 
intensive output rather than produce some activities that are also produced by someone 
else. 

These demand variables include local population, age distribution,
income, and education levels. 11 The reduced-form implications for
(7) are summarized in row 1 of table 1.

The micro data will allow investigation of effects of individual pro¬ 
ductive endowments on individual specialization. Equation (8) will be 
estimated, where T is a vector of variables measuring an individual 
doctor’s productive capacity: 

$$S = \beta' X + \gamma' T + r. \tag{8}$$

D. Separate Effects of Demand Variables and Number 
of Doctors: Estimation 

As shown in row 2 of table 1, cooperation suggests that S rises in N,
whereas Cournot noncooperation allows (but does not necessarily imply)
the opposite. Two approaches will be used to infer the separate
effects of demand shifters and local number of physicians on the local
degree of specialization. The first approach uses a two-stage procedure
to estimate

$$S = \psi' X + \phi N + v. \tag{9}$$

A two-stage approach is used to instrument N since one expects correlation
between N and v to the extent that omitted demand shifters for
medical services are captured in the residual. The first-stage regression
is

$$N = \pi' X + \kappa' Z + w. \tag{10}$$

If income maximization closely approximates physicians’ objectives or 
if the measures of Z (the local amenities) are inadequate, the two-stage 
approach will yield imprecise estimates of the parameters in (9). 
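The two-stage procedure can be sketched on synthetic data as follows. Everything in this block is illustrative: the data are simulated, the coefficient values are made up, and the function names `ols` and `two_stage` are hypothetical helpers, not the estimation code behind the results reported later.

```python
import numpy as np


def ols(y, X):
    """Least-squares coefficients for y = X b + error."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage(S, X, N, Z):
    """Regress N on (X, Z) first, then S on X and the fitted N."""
    const = np.ones((len(S), 1))
    first_stage = np.hstack([const, X, Z])
    N_hat = first_stage @ ols(N, first_stage)
    second_stage = np.hstack([const, X, N_hat[:, None]])
    return ols(S, second_stage)  # [intercept, coefficients on X, coefficient on N]


# Simulated counties (all values are assumptions for illustration)
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 1))   # observed demand shifter
Z = rng.normal(size=(n, 1))   # observed amenity, excluded from the S equation
omitted = rng.normal(size=n)  # unobserved demand shifter entering both residuals
N = 1.0 * X[:, 0] + 1.0 * Z[:, 0] + omitted + rng.normal(size=n)
S = 0.5 * X[:, 0] + 0.3 * N + omitted + rng.normal(size=n)

coef_two_stage = two_stage(S, X, N, Z)
coef_naive = ols(S, np.hstack([np.ones((n, 1)), X, N[:, None]]))
```

In this simulation the naive regression overstates the effect of N on S because the omitted demand shifter sits in both residuals, while the two-stage estimate recovers a value near the assumed 0.3. If Z had little effect on N, the fitted values would carry little independent variation, which is the imprecision the text warns about.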

The second approach examines the correlation between residuals 
of equations (1) and (7). Theoretical considerations imply that unob¬ 
servables belonging in X and/or Z may induce nonzero correlations 
between the residuals of the reduced-form equations. Say, for ex¬ 
ample, that the residual in the number of physicians equation (1) 
contains effects of amenities or other supply shifters. Under coopera¬ 
tion, locales with higher values of these supply shifters will have 
higher than expected degrees of specialization. That is, the residuals 
of equations (1) and (7) will be positively correlated. Noncooperation 
can, but will not necessarily, give a negative correlation since more 
doctors due to supply shifters can give decreased specialization. 

11 In regressions of the form of eq. (7) we are abstracting from amenities as determi¬ 
nants of N. Amenities will be brought in when we focus on the separate effects of X and 
N on S. 




Consider another possibility in which the residual in equation (1) 
contains effects of unobservables shifting local demand for medical 
services. Then, under cooperation, the residuals of equations (1) and 
(7) are positively correlated, whereas under noncooperation the cor¬ 
relation may run in either direction. Pure price taking within activities 
implies no correlation between the residuals since specialization is 
independent of any variables (observed or unobserved) affecting N. 

To summarize the implications of the residual test, nonzero corre¬ 
lation is inconsistent with pure price taking, and a negative correla¬ 
tion is inconsistent with cooperation. A positive correlation rejects the 
pure price-taking form but is consistent with both the cooperative and 
noncooperative forms. 
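
The decision rule of the residual test can be written out as a small sketch. The function name and the labels for the three market forms are hypothetical, and the treatment of an insignificant correlation (nothing is rejected) is an assumption made here for illustration rather than a claim from the text.

```python
def forms_not_rejected(residual_corr, significant):
    """Return the market forms that the residual test fails to reject.

    residual_corr: correlation between the residuals of eqq. (1) and (7).
    significant:   True if the correlation differs significantly from zero.
    """
    forms = {"price_taking", "cooperative", "noncooperative"}
    if not significant:
        # Assumption: an insignificant correlation rejects none of the forms.
        return forms
    # Any nonzero correlation is inconsistent with pure price taking.
    forms.discard("price_taking")
    if residual_corr < 0:
        # A negative correlation is inconsistent with cooperation.
        forms.discard("cooperative")
    return forms
```

A significantly positive correlation therefore leaves both the cooperative and the noncooperative forms standing, while a significantly negative one leaves only the noncooperative form.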


III. Results 

In this section, I present the empirical results of the regressions and
tests outlined in Section II. Subsection A presents results using degree
of specialization variables for each county. A county’s specialization
is based on counts of physicians in different specialty categories.
Subsection B proceeds with results using degree of specialization vari¬ 
ables for individual physicians. These measures of specialization are 
based on the range of procedures performed by the individual doc¬ 
tor. 

All regressions include the square area of the county as a regressor. 
Since local markets are assumed relevant because of transportation 
costs, inclusion of square area attempts to control for differences in 
transportation costs across counties. It is assumed that these costs rise 
with geographic size. 

A. Using County Measures of Degree of Specialization 

The data used in the following regressions come from the August 
1984 version of the Bureau of Health Professions’ Area Resource File 
and the 1983 County and City Data Book. The data bases consist of a 
wide range of demographic and health-related variables from sources 
including the Census Bureau and the American Medical Association 
(AMA). The unit of observation is the county. 

The strategy employed is to use AMA counts of physicians in dif¬ 
ferent specialties to form an aggregate measure of the degree of 
specialization for each county. 12 The county degree of specialization


12 These counts come from the AMA Physician Masterfile. Currently, an individual’s
specialty designation on the Masterfile is based on a questionnaire that asks the doctor
to designate his primary, secondary, and tertiary specialty along with the “number of
is then regressed on the county demand-shifting variables. For example,
general and family practitioners (GPs) cover a wide range of
patient problems compared with doctors in other specialty categories. 
One degree of specialization variable used is the ratio of total doctors 
in the county to general and family practitioners in the county 
(DOCGP). Such a variable is clearly not an exact counterpart to the 
choice variable in the theory (number of activities or ills covered by an 
individual doctor), so several other dependent variables were used to 
check robustness. 
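
As a minimal sketch of how such a ratio variable could be built from county counts, the function below returns the log of total doctors over GPs, the form in which the variable enters the regressions (LDOCGP in table 3). The function name is hypothetical, and returning `None` for counties where the ratio is undefined is an assumption about how missing values might be handled.

```python
import math

def log_doc_gp_ratio(total_doctors, gps):
    """Log of the ratio of total doctors to general/family practitioners.

    Returns None when the ratio or its log is undefined (no doctors or no
    GPs), so that such counties can be dropped as having missing values.
    """
    if total_doctors <= 0 or gps <= 0:
        return None
    return math.log(total_doctors / gps)
```

A county in which every doctor is a GP gets a value of zero, and the value rises as specialists make up a larger share of the county's doctors.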

Table 2 indicates that physicians in general and family practice, 
internal medicine, and general surgery are generalists in the sense 
that a wide range of diagnoses account for the first 50 percent of their 
office visits. This and related evidence motivates the formulation of 
the degree of specialization variables used in the following analysis. 13 
Variables are defined in table 3. Summary statistics are in table 4. The 
sample includes all counties in the United States, excluding those in 
Alaska, Hawaii, and the District of Columbia. Counties are omitted 
when missing values of variables occur. 

The dependent variables are aggregate measures of the degree
of specialization in the county. These are interpreted as proxies for
the narrowness of the range of activities produced by the typical
doctor in the county.

The principal problem with the empirical strategy is that the de¬ 
pendent variables are rather crude measures of their theoretical 
counterparts. While the Rosenblatt et al. (1983) evidence of table 2 is 
highly suggestive of which specialties are most general, there are 
problems that must be noted. Two of these problems are due to 
aggregation. First, table 2 gives a portrait of the range of activities for 
a specialty as a whole. This range may differ from that of a typical 
doctor within the specialty. For example, individual internists may 
perform only one or two of the activities in table 2, but when the data 


hours you spent per typical week” in each designated specialty. In compiling its census 
of types of specialists by county, the AMA categorizes a physician on the basis of the 
specialty in which he spends the most hours. This is not necessarily the one that the 
doctor lists as primary. The AMA’s method of categorization is the appropriate one for 
this study since we are concerned with what a physician does in practice, not with how 
he may prefer to label himself for marketing or other reasons. 

13 Another concern is the issue of some ambiguity in categorizing oneself as a
general/family practitioner versus an internist. I deal with this difficulty by reporting
results for three different formulations of the dependent variable. As table 3 indicates,
only GPs appear in the denominator of LDOCGP, only internists appear in the
denominator of LMEDI, and both GPs and internists appear in the denominator of
LMEDIGP. The denominators of these variables represent the number of “generalists”
in the county.



TABLE 2

Diagnosis Groups Accounting for the First 50 Percent of Ambulatory Visits by Physician Specialty
TABLE 3

Variable Definitions

Variable Name   Definition

LDOCGP          Log(total 1979 nonfederal patient care MDs [excluding hospital
                residents] ÷ general or family practitioners [excluding
                hospital residents])
LMEDI           Log(1979 nonfederal patient care specialists in cardiovascular
                disease, gastroenterology, pulmonary disease, or internal
                medicine [excluding hospital residents] ÷ specialists in
                internal medicine [excluding hospital residents])
LMEDIGP         Log(1979 nonfederal patient care medical specialists* or
                general or family practitioners [excluding hospital residents]
                ÷ general or family practitioners or specialists in internal
                medicine [excluding hospital residents])
LPOP            Log(1980 census population in thousands)
LY              Log(1980 per capita income in thousands)
LSCH            Log(1980 median school years, persons 25+ years of age)
COLL            Percentage of persons 25+ years of age with 4+ years of college
AGED            Percentage of persons 65 or over in 1980
LAREA           Log(1980 land area in square miles)
LDOCS           Log(total 1979 nonfederal patient care MDs excluding hospital
                residents)
TEMPW           Mean January temperature in degrees Fahrenheit
TEMPS           Mean July temperature in degrees Fahrenheit
PRECIP          Mean annual precipitation in inches
PROF            Percentage of labor force in professional and related services
GOVT            Direct general expenditures per capita by local government
HOTDUM          Dummy variable = 1 if receipts from hotels, motels, and related
                establishments are reported; = 0 if not reported because of
                violation of confidentiality
D*HOTL          Interaction of HOTDUM and hotel/motel receipts per capita
POPCHNG         Percentage change in population from 1970 to 1980

Source. —Data come from the August 1984 version of the Area Resource File of the Bureau of Health
Professions and from the 1983 County and City Data Book of the Bureau of the Census. The unit of
observation is the county.

* A “medical specialist” is a doctor classified into one of the following categories: internal medicine,
allergy, cardiovascular disease, dermatology, gastroenterology, pediatrics, pediatric allergy, pediatric
cardiology, or pulmonary diseases.


are aggregated over all internists, we get the list of 11 patient ills in 
the table. To the extent this example is true, it would be incorrect to 
label internists as “general.” 

A second concern is that aggregation smooths over cross-sectional
differences in the range of activities within a particular specialty. Perhaps
general practitioners in less populated locales cover a spectrum
wider than that in table 2, while those in more populated locales cover 
a much narrower range. The results below using individual data indi¬ 
cate that this does occur. In this case there is a smoothing in the 
aggregate data of just the kind of variation the theory addresses. 
TABLE 4

Summary Statistics for County-Level Data

            Number of
Name        Observations    Mean       Min        Max

LDOCGP      2,860           .836       0          3.951
LMEDI       1,605           .147       0          1.946
LMEDIGP     2,877           .129       0          .847
LPOP        3,068           3.206      -2.397     8.920
LY          3,068           2.017      1.104      3.150
LSCH        3,068           2.442      1.792      2.708
COLL        3,068           11.366     2.8        47.8
AGED        3,068           13.250     0          34.040
LAREA       3,068           6.517      3.091      9.907
LDOCS       2,911           2.719      0          9.460
TEMPW       3,068           32.865     1.1        67.2
TEMPS       3,068           75.853     55.5       93.7
PRECIP      3,068           36.251     2.15       99.99
PROF        3,062           18.491     3.906      48.078
GOVT        3,062           623        0          3,151
HOTDUM      3,062           .521       0          1
D*HOTL      3,062           43.36      0          10,206.53
POPCHNG     3,062           16.436     -44.512    232.007


Aggregate-level evidence supporting the theory overcomes the afore¬ 
mentioned bias. 14 


1. Reduced Form 

Let us now turn to the reduced-form results. Table 5 shows ordinary 
least squares (OLS) results for equation (7). As predicted, degree of 
specialization rises with population. The coefficient is significant and 
positive for all dependent variables. 

Rosett and Huang (1973) and Newhouse and Phelps (1976) gener¬ 
ally find small, positive income elasticities of demand for medical care. 
The per capita income coefficients here are significantly different 
from zero for only one of the dependent variables. 


14 Another difficulty in interpreting the aggregate-level results arises from the 
"added-activities" effect. Two components may be captured by the aggregate specializa¬ 
tion variables. The first, which is the central focus of the theory, is variation in the 
number of activities performed by the typical doctor in the county. The second, the 
added-activities effect, refers to variation in the number of activities performed by any 
doctor in the county. To the extent that specialty classification depends on an individ¬ 
ual’s performing special procedures that simply are not performed in less populated 
counties, the aggregate-level specialization variables may smear together the two
components. The added-procedures effect corresponds to the discussion in Newhouse et al.
(1982).
TABLE 5

Aggregate Degree of Specialization Regressions (Reduced Form, OLS)

                            Dependent Variable
Independent
Variable          LDOCGP        LMEDI         LMEDIGP

Intercept         -.501**       -.329*        .044
                  (.245)        (.186)        (.065)
                  (.214)                      (.067)
LPOP              .368***       .068***       .076***
                  (.008)        (.005)        (.002)
                  (.008)                      (.002)
LY                .134***       .010          .017
                  (.050)        (.032)        (.013)
                  (.049)                      (.014)
LSCH              -.100         .077          -.076**
                  (.115)        (.083)        (.030)
                  (.105)                      (.031)
COLL              .030***       .000          .007***
                  (.002)        (.001)        (.001)
                  (.002)                      (.001)
AGED              .003          .003**        -.001
                  (.002)        (.001)        (.001)
                  (.002)                      (.001)
LAREA             -.047***      -.007         -.013***
                  (.012)        (.006)        (.003)
                  (.012)                      (.003)
R²                .605          .148          .483
Sample size       2,860         1,605         2,677

Note. —The heteroscedasticity test of White (1980) has been performed for all OLS regressions. If the null of
homoscedasticity is rejected at α = .05, two standard errors (in parentheses) are reported; the first is the usual OLS
standard error; the second, White's asymptotically correct standard error under heteroscedasticity. Hypothesis tests
for parameters differing from zero make use of White's standard error iff homoscedasticity is rejected.

* Significant at .10.

** Significant at .05.

*** Significant at .01.


The college variable has a positive effect on specialization for most 
of the dependent variables. It may be picking up several effects. First, 
it indicates a fatter right tail of the income distribution. Second, it may 
indicate a more insured population. 15 The variable COLL may also be
combining the effects of education on both the wage and the efficiency
of household production of health capital. 16 Finally, more
educated consumers may be more efficient searchers for a prac¬ 
titioner who provides just those services they need. There is a lower 
"search tax” on being specialized, and therefore there is relatively 
more specialization. 

The AGED variable is included as a demand shifter. It shows 

15 Mitchell and Phelps (1976) find employer-paid medical insurance rising with
income. This may reflect both income and substitution effects since the tax treatment of
employer-provided health insurance causes an inverse relationship between the
marginal price of insurance and the employee’s income.

16 See the discussion in Grossman (1972).
significance in one of the specifications. The variable LAREA standardizes
the counties for transportation cost differences. One would
expect a negative coefficient since transportation costs act like a tax on 
medical services. The point estimates are all negative, with most of 
them significant. 17 

Table A1 in the Appendix reports Tobit results for the specifi¬ 
cations of table 5. The Tobit procedure is applied because of the 
lower truncation of the dependent variables at a value of zero (see 
Tobin 1958). The qualitative results of the Tobit are quite similar to 
the OLS results. The magnitudes of the LPOP coefficients are higher 
with the Tobit model. 

Table 6 contains coefficient estimates of equations (1) and (10). The 
elasticity of doctors with respect to population is significantly above 
one. 18 Per capita income and the college variable are both positive and 
significant. This is consistent with direct income effects as well as with 
the effect of relatively lower prices of medical insurance for those 
with higher marginal income tax rates. 

The AGED variable also has a positive effect on the number of 
doctors. This reflects greater demand for medical services by the 
elderly. They may have a greater demand for services for two rea¬ 
sons. As Grossman (1972) has argued, if the depreciation rate of 
health capital rises with age, one might expect increased demand for 
medical inputs at older ages even while health continues to decline. 19 
A second reason for greater demand by the elderly is Medicare 
coverage. 20 

17 In Baumgardner (1986) results are reported on a sample using 1970 census data
and 1975 physician counts. The reduced-form results are very close to those reported
here. The issue of border-crossing of patients into large cities to receive medical ser¬ 
vices can be partially controlled by allowing a separate intercept and LPOP coefficient 
for larger cities. Doing this for counties containing the core city of standard metropoli¬ 
tan statistical areas of 1,000,000+ population gave the following results: higher inter¬ 
cepts for these counties, LPOP coefficients near zero for these counties, and signifi¬ 
cantly positive LPOP coefficients for the non-core city counties. This latter result is 
especially important since it indicates that the significant LPOP coefficients reported in 
this paper do not occur because there are only specialists in large central cities and 
generalists elsewhere. Instead, the increase in specialization of the representative doc¬ 
tor with local population remains even when a separate large-city effect is allowed. 

18 This result is consistent with each of the market structure assumptions of the 
theoretical model. Baumgardner (1988) shows that the cooperative model necessarily 
implies an elasticity of doctors with respect to local population greater than or equal to 
one. 

19 The technical requirement for this result is that the marginal efficiency of capital 
be inelastic. 

20 Column 2 of table 6 includes local climate variables. These variables are included as 
amenities that may shift the supply of doctors. Of course, to the extent that climate 
affects health or selects the latent health of the population, these climate variables take
the role of demand shifters. Other supply shifters/amenities included in col. 2 of table 6 
follow those used by Fuchs (1978) and Pauly and Satterthwaite (1981). See the discus¬ 
sion on pp. 502-5 of Pauly and Satterthwaite. 
results are mixed. In the LDOCGP equation, the positive effect of
number of doctors rejects pure price taking and is consistent with
either the cooperative or the noncooperative form. The negative effect of
LPOP is not consistent with any of the three forms. The negative
effect of square area adds to the evidence that geographically local
markets are important to specialization decisions. The LMEDI equation
does not show significant slope coefficients at conventional levels.
The LMEDIGP equation shows a significant positive effect of number
of producers with the effect of local population near zero. If one
maintains either the cooperative or the noncooperative form as one’s null
hypothesis, neither would be ruled out. The significant effects of
number of producers and some other demand shifters are inconsistent
with pure price taking. Again we see a negative effect of
geographic size.


B. Using Individual Measures of Degree 
of Specialization 

This subsection uses individual-level data to provide further tests of 
the theory and further description of the relationship between degree 
of specialization and local demand variables. Also, the effects of indi¬ 
vidual productive endowments on specialization are examined. 

The micro-level data avoid the aggregation problems of Section
IIIA. The micro specialization variables are formed from information
on procedures performed by each doctor in the sample. This
allows a closer match of the specialization variables to their theoretical
counterpart than was true of the aggregate-level variables.

Two data sources are used. The physician data come from the
second quarter 1982 Socioeconomic Monitoring System of the AMA.
The county-level data come from the sources described earlier. Merging
the data on each doctor with the data on the county in which he or
she practices provides the micro data file used in the following empirical
tests. The individual doctor is the unit of observation in this
merged data set. Variable names and definitions appear in table 8.
Summary statistics appear in table 9.

Because of the nature of the data on procedures performed, the 
analysis must be performed within a specialty group. The reason for 
this is that the survey instrument asks about different sets of proce¬ 
dures depending on the physician’s specialty classification. The re¬ 
gressions appearing here are confined to GPs. The GP specialty 
group has three desirable characteristics for empirical analysis: (1) a 
relatively large sample size, (2) a relatively large set of procedures 




surveyed (seven, compared with five for most other specialties), and
(3) observations covering a wide range of county populations. 22

The variable measuring an individual’s degree of specialization
(GPHERF) is a Herfindahl index equal to the sum of the squared
shares of seven medical procedures performed by the respondent in

22 The GP group has a usable sample size of 537. The AMA survey used in the
analysis has only one specialty with anywhere close to as large a sample: this is the
internal medicine specialty, with 461 usable observations. A drawback is that the
internists are surveyed on only four procedures, thus allowing for less variation in the
dependent variable. Finally, only 11 internists in the survey practice in counties with a
population below 28,788, while 106 GPs in the survey practice in such counties.


TABLE 8

Variable Name   Definition

A. Variables Referring to the Individual Doctor

GPHERF          $\sum_{i=1}^{7} s_i^2$; $s_i = p_i / \sum_{j=1}^{7} p_j$, where
                $p_i$ = number of procedures of type i performed by the
                practitioner in the last month; the seven procedures are
                annual-type exam, appendectomy, vaginal delivery, caesarian
                section, electrocardiogram, radiologic exam of upper
                gastrointestinal tract, and radiologic exam of chest
LHRS            Log(annual hours worked): weeks worked last year times hours
                worked in last complete practice week
BOARD           Dummy variable = 1 if board certified
SEX             Dummy variable = 1 if female
EXP             Years since medical school graduation

B. Variables Referring to Doctor's County of Practice

PY05            Percentage of households with 1975 personal income < $5,000
PY10            Percentage of households with $5,000 ≤ personal income ≤ $9,999
PY15            Percentage of households with $10,000 ≤ personal income ≤ $14,999
PY25            Percentage of households with $15,000 ≤ personal income ≤ $24,999
PY50            Percentage of households with $25,000 ≤ personal income ≤ $49,999
PYGT50          Percentage of households with personal income ≥ $50,000
EL              Percentage of age 25+ with < 9 years of school
HSGRAD          Percentage of age 25+ completing ≥ 12 years of school but < 4
                years of college
LGPMEDGS        Log(total 1979 nonfederal patient care general or family
                practitioners or medical specialists or general surgeons
                excluding hospital residents)

Source. —Second quarter 1982 Socioeconomic Monitoring System of the American Medical Association, August
1984 Area Resource File of the Bureau of Health Professions, and 1983 County and City Data Book of the Bureau of
the Census. Unit of observation is the physician.

Note. —Other variables used are defined in table 3.
TABLE 9

Summary Statistics for General and Family Practitioner Data Set

            Number of
Name        Observations    Mean       Min        Max

GPHERF      537             .577       .183       1.000
LHRS        526             7.848      4.625      8.841
BOARD       537             .469       0          1
SEX         537             .061       0          1
EXP         537             23.713     2          59
LPOP        537             5.049      1.265      8.920
LY          537             2.194      1.494      2.752
LSCH        537             2.474      1.946      2.639
COLL        537             15.5       5.0        42.8
AGED        537             11.793     2.762      30.675
LAREA       537             6.624      3.091      13.255
PY10        537             17.559     7.334      33.908
PY15        537             15.268     9.503      23.462
PY25        537             24.474     12.240     31.999
PY50        537             20.567     7.472      41.310
PYGT50      537             3.740      .583       16.156
EL          537             19.1       4.2        59.9
HSGRAD      537             50.4       20.3       66.3
LGPMEDGS    537             4.477      0          8.798
TEMPW       536             34.8       1.3        67.2
TEMPS       536             75.1       56.3       93.7
PRECIP      536             35.95      2.15       90.82
PROF        532             19.825     10.862     47.922
GOVT        532             707        0          2,558
HOTDUM      532             .780       0          1
D*HOTL      532             67.71      0          2,632.42
POPCHNG     532             16.736     -20.570    154.94


the last month (see table 8); GPHERF can take values ranging from
1/7 to 1. The minimum value occurs if the doctor performed an equal
number of all seven procedures, while the maximum value occurs if
the doctor performed only one of the seven types of procedures. A
greater value of GPHERF reflects a greater degree of specialization of
the practitioner. 23 Because the seven procedures asked about in the
survey are only a subset of all medical procedures, we must assume
that the variation across doctors in performance of the seven is
correlated with the variation in performance of all procedures. 24
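
The construction of the index can be illustrated with a short sketch that follows the definition in table 8. The function names and the example counts are hypothetical; only the formula (sum of squared procedure shares) and the count-based alternative come from the text.

```python
def herfindahl(procedure_counts):
    """Sum of squared shares of each procedure type in the doctor's total."""
    total = sum(procedure_counts)
    if total <= 0:
        raise ValueError("at least one procedure must have been performed")
    return sum((count / total) ** 2 for count in procedure_counts)


def procedure_type_count(procedure_counts):
    """Number of procedure types performed at least once (GPCOUNT-style)."""
    return sum(1 for count in procedure_counts if count > 0)


# Hypothetical monthly counts for the seven surveyed procedures.
generalist = [3, 3, 3, 3, 3, 3, 3]   # equal shares: index equals 1/7
specialist = [0, 0, 12, 0, 0, 0, 0]  # one procedure only: index equals 1
```

The two hypothetical doctors reproduce the bounds described above: equal shares give the minimum of 1/7, and a single procedure type gives the maximum of 1.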


23 Another way to see this is to notice that GPHERF is monotonic in Euclidean
distance. It takes a larger value the greater is the distance between the vector of
procedure shares actually performed and the vector of equal procedure shares.

24 The Appendix contains a sample of regression results replicating those reported
here using a different dependent variable (see table A2). The alternative specialization
variable simply counts how many of the seven types of procedures were performed by
the GP in the last month. This variable (GPCOUNT) can run from zero to seven, a
I view the micro evidence below as further and deeper evidence of
variation in the degree of specialization as a function of local demand 
for medical services. There are two senses in which the micro-level 
work investigates differences in the degree of specialization at a finer 
level than the aggregate work does. First, the specialization variable is 
based on procedures performed by the individual rather than on 
numbers of different kinds of specialists in each county. Second, we 
are looking at differences in the degree of specialization within a 
specialty category. 

1. Reduced Form 

Table 10 reports OLS results of GPHERF regressed on local demand 
shifters (eq. [7]). The results are consistent with the theory. The vari¬ 
able LPOP is strongly significant in all three specifications, indicating 
that specialization of GPs into a narrower range of practice activities 
increases with local population. Specialization rises with the percent¬ 
age of the population that is elderly (AGED). 

While not statistically significant, the sign and magnitude of 
LAREA are consistent with the theory across all specifications. Be¬ 
cause of increased transportation costs, greater area acts as a greater 
tax on demand for medical services. If transportation costs rise 
linearly with distance, transportation costs rise with the square root of 
area (assuming circular local markets with evenly distributed popula¬ 
tions). Therefore, one expects the LAREA coefficient to have a 
smaller magnitude than LPOP. 
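
The square-root relationship can be spelled out under the stated assumptions. The symbols A (market area), r (market radius), t (cost per unit of distance), and T (transportation cost to the edge of the market) are introduced here only for illustration.

```latex
A = \pi r^{2} \;\Rightarrow\; r = \sqrt{A/\pi}, \qquad
T = t\,r = \frac{t}{\sqrt{\pi}}\,A^{1/2}, \qquad
\log T = \log\frac{t}{\sqrt{\pi}} + \tfrac{1}{2}\log A .
```

With an evenly distributed population the average distance to the center is two-thirds of the radius, so average cost is also proportional to the square root of area. In logs, a 1 percent increase in area then raises transportation cost by roughly half a percent.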

Other variables are not significant. As found in some of the regres¬ 
sions using aggregate data, the county’s per capita income does not 
affect specialization. A breakdown of the county population by per¬ 
centage in different income brackets (specification 2) shows no signs 
significantly different from zero. 

The variable COLL is significantly negative in specification 3 and in 
the Tobit results of table 11. This sign contradicts the aggregate data 
reduced-form results, where COLL was found to be, if anything, a 
positive influence on specialization. The negative sign on COLL in the 
micro data is inconsistent with the cooperative form of the model 
since COLL is associated with more doctors but less specialization. 
Even if we restrict our measure of N to the number of GPs, we find 


greater value corresponding to a lesser degree of specialization. An advantage of 
GPCOUNT is its correspondence to the number of activities performed. A disadvan¬ 
tage of GPCOUNT is that a procedure performed only once or twice in the last month 
is weighted as heavily as one performed many times (GPHERF weights such proce¬ 
dures less heavily). 



TABLE 10

Degree of Specialization Regressions, Individual Data (Reduced Form, OLS)

                       Dependent Variable: GPHERF
Independent
Variable          (1)           (2)           (3)

Intercept         .272          .608          .613**
                  (.484)        (.522)        (.299)
                                              (.293)
LPOP              .034***       .036***       .030***
                  (.008)        (.007)        (.009)
                                              (.009)
LY                .043          ...           .077
                  (.076)                      (.081)
                                              (.080)
LSCH              .068          .013          ...
                  (.217)        (.225)
COLL              -.004*        -.003         -.006**
                  (.002)        (.003)        (.003)
                                              (.003)
AGED              .006**        .009***       .006**
                  (.003)        (.003)        (.003)
                                              (.003)
LAREA             -.020*        -.015         -.015
                  (.011)        (.011)        (.011)
                                              (.011)
PY10              ...           -.011         ...
                                (.006)
PY15              ...           .002          ...
                                (.008)
PY25              ...           -.000         ...
                                (.005)
PY50              ...           -.001         ...
                                (.004)
PYGT50            ...           -.004         ...
                                (.009)
EL                ...           ...           -.003
                                              (.004)
                                              (.004)
HSGRAD            ...           ...           -.004
                                              (.003)
                                              (.003)
R²                .060          .068          .062
Sample size       537           537           537

Note. —See notes to table 5.













TABLE 11

Degree of Specialization Regressions, Individual Data (Reduced Form, Tobit)

                       Dependent Variable: GPHERF
Independent
Variable          (1)           (2)           (3)

Constant          .256          .604**        .614***
                  (.220)        (.237)        ([illegible])
LPOP              .055***       .037***       .081***
                  (.004)        (.005)        (.004)
LY                .042          ...           .076**
                  (.054)                      (.057)
LSCH              .074          .015          ...
                  (.098)        (.102)
COLL              -.004***      -.005**       -.006***
                  (.001)        (.001)        (.001)
AGED              .006***       .009***       .006***
                  (.001)        (.002)        (.001)
LAREA             -.021***      -.016***      -.016***
                  (.005)        (.005)        (.005)
PY10              ...           -.011***      ...
                                (.005)
PY15              ...           .002          ...
                                (.005)
PY25              ...           .000          ...
                                (.002)
PY50              ...           -.001         ...
                                (.002)
PYGT50            ...           -.004         ...
                                (.004)
EL                ...           ...           -.005*
                                              (.002)
HSGRAD            ...           ...           -.004**
                                              (.002)
s                 .100          .100          .100
Sample size       537           537           537


Note. —Out of 537 observations, 459 are nonlimit, 78 are at the upper truncation point of 1, and none are at the
lower truncation point of .143. To obtain the slope of the conditional mean function of GPHERF at regressor values
x, multiply the coefficient estimates by

$$\Phi\!\left(\frac{1 - b'x}{s}\right) - \Phi\!\left(\frac{.143 - b'x}{s}\right),$$

where $\Phi(\cdot)$ is the standardized normal distribution function, b is the vector of coefficient estimates, and s is the
estimate of $\sigma$. The conditional mean is

$$E[\mathrm{GPHERF} \mid x] = .143\,\Phi(z_L) + [1 - \Phi(z_U)] + s\,[\phi(z_L) - \phi(z_U)] + [\Phi(z_U) - \Phi(z_L)]\,b'x,$$

with $z_L = (.143 - b'x)/s$ and $z_U = (1 - b'x)/s$, where $\phi(\cdot)$ is the standardized normal density.
that greater COLL implies more GPs but, as tables 10 and 11 indicate,
less specialized GPs. 25

Table 11 reports results using a two-limit Tobit empirical model
(Rosett and Nelson 1975). The justification for the Tobit model is the
boundedness of GPHERF between 1/7 (= 0.143) and 1. 26 The Tobit
estimates are very close to those from OLS with asymptotic standard
errors lower than the OLS standard errors.
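
For readers who want to see the mechanics of a two-limit Tobit, the sketch below codes the censoring rule and the standard textbook expressions for the conditional mean and its slope factor, with limits 1/7 and 1. It is an illustration only, not the estimation code behind table 11; all function names are hypothetical, and `index` stands for the fitted linear index (coefficients times regressors) with `sigma` the estimated error standard deviation.

```python
import math

LOWER, UPPER = 1.0 / 7.0, 1.0  # bounds on GPHERF


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def censor(spec):
    """Observed index given latent specialization: clip to [LOWER, UPPER]."""
    return min(max(spec, LOWER), UPPER)


def slope_factor(index, sigma):
    """Factor that scales a coefficient into a slope of the conditional mean."""
    return norm_cdf((UPPER - index) / sigma) - norm_cdf((LOWER - index) / sigma)


def conditional_mean(index, sigma):
    """Expected observed value for a two-limit Tobit at a given linear index."""
    z_lo = (LOWER - index) / sigma
    z_hi = (UPPER - index) / sigma
    return (LOWER * norm_cdf(z_lo)
            + UPPER * (1.0 - norm_cdf(z_hi))
            + (norm_cdf(z_hi) - norm_cdf(z_lo)) * index
            + sigma * (norm_pdf(z_lo) - norm_pdf(z_hi)))
```

Because the slope factor is a difference of two probabilities, it lies between zero and one, so the slopes of the conditional mean are never larger in absolute value than the Tobit coefficients themselves.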

The individual-level reduced-form results reject the pure price- 
taking form of the model since local demand variables affect degree 
of specialization. The evidence also rejects the cooperative in favor of 
the noncooperative. 


2. Separate Effects of Demand Variables 
and Number of Doctors 

Having reviewed the reduced-form results from the micro data set, 
let us turn to the issue of the separate effects of local number of 
practitioners and demand shifters on the degree of specialization. 
The residual test results in table 12 indicate that a GP practicing in a 
locale with more doctors than expected on the basis of observables is 
more specialized than expected. This result continues to hold when 
measures of the individual’s endowments are included among the 
observables (col. 2 of table 12; endowments are discussed in subsection
3 below). The residual test rejects neither the cooperative nor the
noncooperative form of the model but rejects pure price taking. 27

Second-stage results (for eq. [9]) appear in table 13. In theory, we
are attempting to separate the pure effect of demand shifters on S 
from the effect of the number of local producers on S. It is appropri¬ 
ate to use the number of doctors within specialties that potentially 
produce activities that are also potentially produced by GPs rather 


25 A regression of log(GPs) on demand variables as in col. 1 of table 6 has significant
positive coefficients on LPOP (0.705), LPOP² (0.015), and COLL (0.013). One may
argue that LGPMEDGS is a better measure of number of local producers for the micro
data since this variable includes all specialists who produce activities GPs might provide.
Use of this measure does not change any of the arguments above. A regression of
LGPMEDGS on demand variables also shows a significant positive coefficient on LPOP
(1.187), an insignificant coefficient on LPOP² (0.004), and a significant positive
coefficient on COLL (0.033).

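26 Actual specialization (SPEC) is assumed linear in the independent variables:

$$\mathrm{SPEC} = \beta'X + u; \quad u \sim N(0, \sigma^{2}), \tag{11}$$

$$\mathrm{GPHERF} = \begin{cases} \mathrm{SPEC} & \text{if } 0.143 < \mathrm{SPEC} < 1 \\ 0.143 & \text{if } \mathrm{SPEC} \le 0.143 \\ 1 & \text{if } \mathrm{SPEC} \ge 1. \end{cases} \tag{12}$$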


27 These results match those of the analogous residual tests on the aggregate-level
data. Aggregate-level residual correlations are reported in table A3 of the Appendix.



TABLE 12

Residual Tests for Micro Data Set

Residual Correlations

(1)             (2)

.122            .127
N = 537         N = 526

Note. —Col. 1 contains residual correlations from specialization regression 1 of table 10 and a doctors' supply
regression with the quadratic specification as in col. 1 of table 6. Col. 2 contains residual correlations from
specialization regression 1 of table 14 and a doctors' supply regression with the quadratic specification as in col. 1
of table 6. N refers to the sample size for each pair of regressions. Both correlations are significant at α = .01
(two-sided).


TABLE 13

Second-Stage Degree of Specialization Regressions (Two-Stage Least Squares)

                          Dependent Variable: GPHERF
Independent Variable      (1)               (2)

Intercept                 .390              .929*
                          (.507)            (.525)
LGPMEDGS                  .037              .007
                          (.063)            (.060)
LPOP                      -.008             .014
                          (.068)            (.065)
LY                        .044              .061
                          (.080)            (.077)
LSCH                      .045              .027
                          (.217)            (.212)
COLL                      -.005*            -.002
                          (.003)            (.003)
AGED                      .005              .004
                          (.004)            (.003)
LAREA                     -.019*            -.008
                          (.011)            (.011)
LHRS                      ...               -.082***
                                            (.023)
BOARD                     ...               -.032
                                            (.020)
SEX                       ...               .017
                                            (.039)
EXP                       ...               -1.2 × 10^-?
                                            (2.7 × 10^-?)
EXP²                      ...               9.2 × 10^-?*
                                            (5.1 × 10^-?)
R²                        .059              .148
Sample size               532               521

Note.—The first-stage regression is a reduced-form LGPMEDGS regression with independent variables as in col.
1 of table 6.
than to use the total number of doctors for the measure of number of
local producers. Therefore, LGPMEDGS is used as the empirical
measure of N. 28 The COLL and LAREA coefficients are of borderline
significance, but the regression is unable to separate the effects of
population versus number of producers. In the light of the significant
effects of LPOP in the reduced-form equations in tables 10 and 11,
the poor results of the two-stage estimation are likely because so much
of the variation in LGPMEDGS is explained by variation in LPOP.

3. Individual Endowments 

Let us now examine the effects of individual endowments on degree
of specialization (eq. [8]). The log of estimated hours worked last year
(LHRS) is entered as a measure of the work time endowment.
BOARD (= 1 if board certified) may reflect latent productive endowment,
but there are other considerations that may affect one’s ability
and decision to become board certified. 29 SEX (= 1 if female) may
reflect the level of total capital if female physicians anticipate less time
in the market. The experience variables (EXP and EXP²) may capture
many effects. Note that experience is not true experience but is,
instead, years since medical school graduation. The variable EXP
confounds effects of vintage, age, experience in the profession, and
tenure in a community. The tenure in a community may affect
specialization through its effect on the demand faced by the individual
doctor. If reputation is important to consumers, doctors with low
tenure (who will tend to have lower EXP) will face lower demand,
which in turn may affect their degree of specialization.

Turning to the regression results of tables 14, 15, and 16, we find a
significant, negative effect of LHRS on degree of specialization.
Board-certified GPs are found to provide a wider range of procedures,
while sex has no significant effect. Table 16 shows that the
negative effect of LHRS remains when work time is treated as endogenous
and is estimated using two-stage least squares.

Years since medical school graduation (EXP) has a U-shaped effect. 
The Tobit results (table 15) indicate more generalization over the first 
6 years after graduation with increased specialization thereafter. This 
result conforms with several hypotheses. It may be that one’s capacity 
for general knowledge peaks in the low- to mid-30s, and this is cor-


28 There is no material change in results if either LDOCS or log(number of GPs) is
used as the measure of N.

29 Rather than productive capital endowment, board certification may indicate
proficiency in an already narrow set of activities. The decision to become certified may
reflect the local practice environment since many hospitals (especially in larger cities)
have turned to certification as an important criterion for staff privileges.
TABLE 14

Degree of Specialization Regressions Using Measures of Physicians’
Endowments (Reduced Form, OLS)

                          Dependent Variable: GPHERF
Independent Variable      (1)               (2)

Intercept                 .978*             .921*
                          (.510)            (.502)
LPOP                      .026***           .024***
                          (.008)            (.008)
LY                        .065              .047
                          (.075)            (.073)
LSCH                      .065              .053
                          (.214)            (.210)
COLL                      -.003             -.002
                          (.002)            (.002)
AGED                      .006**            .004
                          (.003)            (.003)
LAREA                     -.014             -.011
                          (.010)            (.010)
LHRS                      -.092***          -.085***
                          (.023)            (.023)
BOARD                     -.067***          -.034*
                          (.019)            (.020)
SEX                       ...               .014
                                            (.039)
EXP                       ...               -7.9 × 10^-?
                                            (2.7 × 10^-?)
EXP²                      ...               8.3 × 10^-?
                                            (5.1 × 10^-?)
R²                        .111              .150
Sample size               526               526

Note.—See notes to table 5.


related with the range of activities produced. A second explanation is 
that the pattern reflects the outcome of search for those activities in
which one has a comparative advantage. Early in one’s career many
activities are sampled to judge one’s ability. As information is accumulated
over time, activities are dropped when ability in them has
been revealed to be sufficiently low. The evidence is also consistent
with the reputation argument. Since more moves of doctors to another
community occur in the early years of practice, 30 the evidence is
consistent with an increased demand for a particular doctor with
increases in his tenure in the community.

Both physician endowments and local demand shifters affect the 


30 Unpublished tabulations by William D. Marder report that 16.41 percent of physicians
out of medical school less than 5 years moved practice location in 1980. Only 7.02
percent of physicians 6-15 years out of medical school moved.
TABLE 15

Degree of Specialization Regressions Using Measures of Physicians'
Endowments (Reduced Form, Tobit)

                          Dependent Variable: GPHERF
Independent Variable      (1)               (2)

Constant                  .993***           .944***
                          (.237)            (.238)
LPOP                      .028***           .025***
                          (.004)            (.004)
LY                        .062*             .046
                          (.035)            (.035)
LSCH                      .073              .059
                          (.100)            (.100)
COLL                      -.003***          -.002***
                          (.001)            (.001)
AGED                      .006***           .005***
                          (.001)            (.001)
LAREA                     -.015***          -.011**
                          (.005)            (.005)
LHRS                      -.097***          -.090***
                          (.011)            (.011)
BOARD                     -.070***          -.036***
                          (.009)            (.009)
SEX                       ...               .012
                                            (.019)
EXP                       ...               -1.1 × 10^-?
                                            (1.3 × 10^-?)
EXP²                      ...               9.2 × 10^-?***
                                            (2.4 × 10^-?)
s                         .100              .100
Sample size               526               526

Note.—Out of 526 observations, 449 are nonlimit, 77 are at the upper truncation point (1), and none are at the
lower truncation point (.143). Other formulae are equivalent to those of the notes to table 11.


degree of specialization. The demand shifters found to affect specialization
in tables 10 and 11 remain significant and of the same magnitudes
when the endowment variables are included. In comparisons
of two GPs who work the same hours and are otherwise identical, the
practitioner in the more populated locale will tend to be more specialized.
In comparisons of two GPs in the same locale, the one who works
more hours will, on average, practice a wider range of activities.

IV. Concluding Comments 

I have presented empirical evidence from the physicians’ services
market based on a theoretical model of variation in the division of
labor across geographically local markets. The pure price-taking form
of the model is clearly rejected. Evidence is equivocal concerning the
TABLE 16

Degree of Specialization Regressions
Instrumenting Hours Worked
(Two-Stage Least Squares)

Independent Variable      Dependent Variable: GPHERF

Intercept                 2.586***
                          (.?11)
LPOP                      .020**
                          (.010)
LY                        .127
                          (.084)
LSCH                      .022
                          (.229)
COLL                      -.004*
                          (.002)
AGED                      .006**
                          (.003)
LAREA                     -.007
                          (.012)
LHRS                      -.277***
                          (.102)
BOARD                     -.053**
                          (.022)
FMG                       -.057*
                          (.033)
R²                        .092
Sample size               521

Note.—FMG = 1 if doctor is graduate of foreign medical
school. The instrumental variables are TEMPW, TEMPS, PRECIP,
PROF, GOVT, HOTDUM, D*HOTL, POPCHNG, SEX, EXP,
EXP², and other exogenous variables of this table.


cooperative versus the noncooperative forms of the model. Two-stage
results from the aggregate-level data tend to support cooperation
since the partial effect of number of doctors on specialization is positive
while the partial effect of population on specialization tends not
to be positive. However, the micro-level reduced-form results tend
to support noncooperation. The finding that percentage college-educated
has a negative effect on specialization but a positive effect
on number of practitioners is inconsistent with cooperation. Evidence
from both levels of aggregation supported the following conclusions:
the degree of the division of labor increases with local population;
some other demand-shifting variables for medical services are also
related to the degree of specialization of local physicians; and, when
demand variables such as population are held constant, geographically
larger counties exhibit less division of labor.

The micro-level evidence, which measures the degree of division of
labor using data on the procedures performed by individual physicians,
indicates that specialization occurs at an even finer level than
can be demonstrated at the aggregate level. We find that the range 
of procedures performed by general and family practitioners varies 
systematically with local demand shifters for medical services. Such 
within-specialty variation in range of procedures performed as a 
function of local demand variables contradicts the view that a doctor’s 
specialty classification suffices as a measure of the range of activities 
he or she performs. Another interesting result from the micro data is 
that general/family practitioners who work more hours perform a 
wider range of activities. 


Appendix 


TABLE A1

Aggregate Degree of Specialization Regressions (Reduced Form, Tobit)

                                  Dependent Variable
Independent Variable      LDOCGP         LMEDI          LMEDIGP

Constant                  -.895***       -1.773***      -.342***
                          (.292)         (.427)         (.131)
LPOP                      .430***        .165***        .143***
                          (.010)         (.010)         (.004)
LY                        .058           .008           -.056**
                          (.060)         (.064)         (.027)
LSCH                      .035           .446**         .035
                          (.136)         (.187)         (.061)
COLL                      .029***        .000           .007***
                          (.002)         (.002)         (.001)
AGED                      .001           .005*          -.004***
                          (.003)         (.003)         (.001)
LAREA                     -.049***       -.013          -.017***
                          (.013)         (.012)         (.006)
s                         .507           .319           .194
Sample size               2,860          1,605          2,877

Note.—The model assumes that actual specialization (SPEC) is linear in the independent variables:

\[ \mathrm{SPEC} = \beta'X + u; \quad u \sim N(0, \sigma^2). \]

Let DEP be the observed variable. Then

\[ \mathrm{DEP} = \begin{cases} \mathrm{SPEC} & \text{if } 0 < \mathrm{SPEC} \\ 0 & \text{if } \mathrm{SPEC} \le 0. \end{cases} \]

The LDOCGP, LMEDI, and LMEDIGP regressions have 487, 820, and 1,480 observations at the lower truncation
point (0), respectively. To obtain the slope of the conditional mean function of DEP at regressor values x, multiply
the coefficient estimates by $F(b'x/s)$, where $F(\cdot)$ is the standardized normal distribution function, b is the estimate of
β, and s is the estimate of σ. The conditional mean is

\[ E(\mathrm{DEP} \mid x) = F\!\left(\frac{b'x}{s}\right) b'x + s\, f\!\left(\frac{b'x}{s}\right), \]

where $f(\cdot)$ is the standardized normal density.
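The adjustment described in the note to table A1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the index value `b'x` passed in the usage lines is invented for the example, and only the LPOP coefficient (.430) and s (.507) are taken from the LDOCGP column.

```python
import math


def std_normal_cdf(z):
    """Standardized normal distribution function F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def std_normal_pdf(z):
    """Standardized normal density f(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def tobit_conditional_mean(index, s):
    """E(DEP | x) = F(b'x/s) * b'x + s * f(b'x/s) for censoring at zero.

    `index` stands for the fitted value b'x (hypothetical input).
    """
    z = index / s
    return std_normal_cdf(z) * index + s * std_normal_pdf(z)


def tobit_slope(coef, index, s):
    """Slope of the conditional mean: the coefficient scaled by F(b'x/s)."""
    return coef * std_normal_cdf(index / s)


# Illustrative use: LPOP coefficient and s from the LDOCGP column of table A1,
# evaluated at an assumed (not reported) index value of 0.2.
example_slope = tobit_slope(0.430, 0.2, 0.507)
example_mean = tobit_conditional_mean(0.2, 0.507)
```

Because the scaling factor lies between 0 and 1, the slope of the conditional mean is always smaller in absolute value than the reported coefficient.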
TABLE A2

Degree of Specialization Regressions
Using GPCOUNT (Reduced Form, OLS)

Independent Variable      Dependent Variable: GPCOUNT

Intercept                 5.442**
                          (2.373)
LPOP                      -.251***
                          (.041)
LY                        .477
                          (.372)
LSCH                      -1.560
                          (1.062)
COLL                      .013
                          (.010)
AGED                      -.018
                          (.014)
LAREA                     .195***
                          (.051)
R²                        .097
Sample size               537

Note.—See notes to table 5.

TABLE A3

Residual Tests

Dependent Variable        Residual Correlation

LDOCGP (N = 2,860)        .538
LMEDI (N = 1,605)         .185
LMEDIGP (N = 2,877)       .334

Note.—These are the residual correlations from specialization
regressions of table 5 and doctors’ supply regressions with specification
as in col. 1 of table 6. N refers to the sample size. All correlations
are significant at α = .001 (two-sided).
Are Consumers Ricardian? Evidence for the
United States 


Paul Evans 

Ohio State University 


This paper derives three empirical implications of a well-specified 
model of consumer behavior, a model that nests both Ricardian 
equivalence and an alternative, non-Ricardian theory. Ricardian 
equivalence is then tested against this alternative. The tests reveal 
that Ricardian equivalence cannot be rejected. 


I. Introduction 

In conventional macroeconomic analysis, government debt affects the
economy because households view it as net wealth. The larger the
government debt is, the wealthier households feel and the more they
consume. In principle, however, households need not view government
debt as net wealth. David Ricardo (Sraffa 1951) pointed out that
households might conceivably treat the future taxes servicing the government
debt as exactly offsetting it. Barro (1974) has shown that
maximizing households will actually do so if they can accurately
foresee future taxes, if they face perfect capital markets, and if they
have effectively infinite horizons. “Ricardian equivalence” is said to
hold if households do treat future servicing taxes as an exact offset to
the government debt.

Most macroeconomists model government debt as net wealth because
they regard the assumptions necessary for Ricardian equivalence
as unrealistic. Like perfect competition, however, Ricardian
equivalence may be a good approximation in many applications even

I thank John Campbell, Roger Kormendi, Pok-Sang Lam, Eric Leeper, Nelson Mark,
Sam Peltzman, and John Seater for helpful comments.
if the assumptions underlying it are unrealistic. Therefore, empirical
analysis rather than introspection should determine how government
debt is modeled. This paper provides such an analysis.

If Ricardian equivalence is a good approximation, households
should largely disregard government debt in their consumption decisions;
if Ricardian equivalence is a poor approximation, households
should consume appreciably more, the larger the government debt is.
This paper investigates which of these alternatives is more consistent
with quarterly U.S. data.

Other papers have investigated this issue empirically. For example,
Kochin (1974), Barro (1978), Tanner (1979), Seater (1982), Kormendi
(1983), Aschauer (1985), and Seater and Mariano (1985) find
no evidence that households consume more, the larger the government
debt is. In contrast, Feldstein (1978, 1979, 1982), Blinder and
Deaton (1985), Boskin and Kotlikoff (1985), and Modigliani and Sterling
(1986) do find such evidence. However, none of these papers
derives the consumption function it estimates from a well-specified
model that nests both Ricardian equivalence and an alternative in
which households regard government debt as net wealth. For example,
many of the papers motivate the models that they estimate by
appealing to the life cycle model, which does not nest Ricardian
equivalence. Still others appeal to the permanent-income model,
which does not nest any alternative to Ricardian equivalence.

Blanchard (1985) has provided one of the few models in the literature
that does nest Ricardian equivalence and such an alternative.
Depending on whether a crucial parameter is zero or positive, households
have infinite horizons, internalize all future generations, and
exhibit Ricardian behavior; or they have finite horizons, are at least
somewhat disconnected from future generations, and exhibit non-Ricardian
behavior. The model has testable implications that this paper
examines. No evidence is found for Blanchard’s alternative to
Ricardian equivalence.

The rest of the paper is organized as follows. Section II lays out
Blanchard’s model and then derives three of its testable implications.
Section III discusses the data used in the empirical analysis. Section
IV tests two of the implications. Sections V and VI use intervention
analysis to test the third implication. The interventions considered in
Section V are the major tax changes enacted during the postwar
period. Section VI examines the period since the Reagan tax cut, a
period that Poterba and Summers (1987, p. 389) have termed “a
natural experiment for testing the Ricardian equivalence proposition.”
Section VII summarizes the paper and reviews a burgeoning
literature that supports Ricardian equivalence.
II. Three Empirical Implications of Ricardian
Equivalence

A. Blanchard's Model 1 

Blanchard assumes that households face perfect capital and insurance
markets but have finite horizons because a fraction $\mu$ of them dies
each period. Given these assumptions and some assumptions about
preferences and the distributions of income and wealth, Blanchard
derives the aggregate consumption function 2

\[ C_t = \alpha\Bigl[(1 + R_t)A_{t-1} + \sum_{i=0}^{\infty} (1 - \mu)^i \beta_{it} E_t W_{t+i}\Bigr], \tag{1} \]

where $C_t$ is aggregate consumption during period t, $A_{t-1}$ is the stock
of assets outstanding at the end of period t - 1, $R_t$ is the real holding-period
yield during period t on the assets carried over from period
t - 1, $E_t$ is the expectations operator conditional on the information
known by households during period t, $W_t$ is aggregate disposable
wage income during period t, 3 $\beta_{0t} \equiv 1$,

\[ \beta_{it} \equiv \frac{1}{\prod_{j=1}^{i} (1 + F_{jt})}, \quad i > 0, \tag{2} \]

$F_{jt}$ is the forward real interest rate in period t on bonds that will be
issued in period t + j - 1 and that will mature in period t + j, and $\alpha$
and $\mu$ are parameters satisfying $0 < \alpha < 1$ and $0 \le \mu < 1$.

Equation (1) has a straightforward interpretation. Households treat
the term in brackets as wealth, consuming the fraction $\alpha$ of it every
period. Wealth equals $A_{t-1}$, the market value of all assets that have

1 Blanchard’s model is discussed only cursorily here. For a more complete understanding
of the model, see Blanchard’s paper. The model here is written in discrete
time rather than in continuous time, the notation differs somewhat from Blanchard’s,
and disposable wage income is stochastic rather than nonstochastic.

2 This is a discrete-time, stochastic analogue of Blanchard’s eq. (25). Any of the
following assumptions would yield a consumption function similar to (1): the risk in
wage income is completely diversifiable so that households can completely insure
against it; the momentary utility function is quadratic, and the real interest rate is
constant; households have constant relative risk aversion, and conditional on the information
known to them in period t, their consumptions in each period t + i are lognormally
distributed with a variance that depends only on i; and households have constant
absolute risk aversion, and conditional on the information known to them in period
t, their consumptions in each period t + i are normally distributed with a variance
that depends only on i.

3 For simplicity, it is assumed that all taxes are nondistorting. Given the assumptions
made in n. 12, no loss of generality results. For further discussion of taxes, see nn. 5
and 6.
been accumulated, plus $W_t + R_t A_{t-1}$, current disposable income, plus
$\sum_{i=1}^{\infty} (1 - \mu)^i \beta_{it} E_t W_{t+i}$, the expected present value of the future disposable
wage income that will be received by current households. If
$\mu > 0$, households discount taxes at a higher rate than they discount
future interest income. In other words, one unit of taxes in
period t + i has the present value $(1 - \mu)^i \beta_{it}$, which is smaller than
the present value of one unit of interest income. 4
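The discounting described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names are hypothetical, and the forward rates and the value of μ in the usage lines are invented for illustration rather than taken from the paper.

```python
def discount_factors(forward_rates):
    """Return [beta_0, beta_1, ...] following eq. (2).

    beta_0 = 1 and beta_i = 1 / prod_{j=1..i} (1 + F_j), where
    forward_rates[j-1] plays the role of F_jt (hypothetical inputs).
    """
    betas = [1.0]
    for f in forward_rates:
        betas.append(betas[-1] / (1.0 + f))
    return betas


def tax_present_values(forward_rates, mu):
    """Present value of one unit of taxes at each horizon i: (1 - mu)^i * beta_i."""
    betas = discount_factors(forward_rates)
    return [(1.0 - mu) ** i * b for i, b in enumerate(betas)]


# Illustrative values only: a flat 1 percent forward rate and mu = 0.02.
rates = [0.01, 0.01, 0.01]
interest_pv = discount_factors(rates)        # value of one unit of interest income
taxes_pv = tax_present_values(rates, 0.02)   # value of one unit of taxes when mu > 0
```

With μ = 0 the two lists coincide, which is the Ricardian case; with μ > 0 every future tax payment is worth less to current households than an interest payment of the same size and date.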
The aggregate budget constraint is 

\[ W_t + R_t A_{t-1} = C_t + \Delta A_t, \tag{3} \]

where $\Delta$ is the difference operator. Equation (3) states that aggregate
disposable income $W_t + R_t A_{t-1}$ can be either consumed or accumulated
as assets. 5 It is assumed that the following variant of the
expectations theory of the term structure holds:

\[ E_t R_{t+i} A_{t+i-1} = F_{it} E_t A_{t+i-1}. \tag{4} \]

Using equation (3) to eliminate $W_t, W_{t+1}, W_{t+2}, \ldots$ from equation (1)
and substituting from equation (4) results in 6

\[ C_t = \alpha \sum_{i=0}^{\infty} (1 - \mu)^i \beta_{it} E_t (C_{t+i} + \mu A_{t+i}). \tag{5} \]

Consumption is therefore increasing in $E_t A_t, E_t A_{t+1}, E_t A_{t+2}, \ldots$
unless Ricardian equivalence holds and $\mu = 0$. Consequently, the
higher households expect the future path of the government debt to
be, ceteris paribus, and hence the higher are $E_t A_t, E_t A_{t+1}, E_t A_{t+2}, \ldots$,
the more households consume.
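The step from equations (1), (3), and (4) to equation (5) is stated without intermediate algebra. One way to fill it in, offered as a sketch that assumes the sums converge (as the transversality condition in n. 6 is meant to ensure), is the following.

\[
\begin{aligned}
W_{t+i} &= C_{t+i} + A_{t+i} - (1 + R_{t+i})A_{t+i-1} && \text{(from eq. [3])} \\
\beta_{it}\, E_t (1 + R_{t+i}) A_{t+i-1} &= \beta_{it}(1 + F_{it})\, E_t A_{t+i-1} = \beta_{i-1,t}\, E_t A_{t+i-1}, \quad i \ge 1 && \text{(from eqs. [4] and [2])} \\
\sum_{i=1}^{\infty} (1 - \mu)^i \beta_{i-1,t}\, E_t A_{t+i-1} &= (1 - \mu) \sum_{i=0}^{\infty} (1 - \mu)^i \beta_{it}\, E_t A_{t+i} && \text{(reindexing)}
\end{aligned}
\]

Substituting the first line into equation (1), the $i = 0$ term $(1 + R_t)A_{t-1}$ cancels against the inherited-asset term in brackets, and the remaining asset terms combine as

\[
\sum_{i=0}^{\infty} (1 - \mu)^i \beta_{it}\, E_t A_{t+i} - (1 - \mu) \sum_{i=0}^{\infty} (1 - \mu)^i \beta_{it}\, E_t A_{t+i} = \mu \sum_{i=0}^{\infty} (1 - \mu)^i \beta_{it}\, E_t A_{t+i},
\]

which gives $C_t = \alpha \sum_{i=0}^{\infty} (1 - \mu)^i \beta_{it} E_t (C_{t+i} + \mu A_{t+i})$. The asset terms drop out entirely only when $\mu = 0$.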

4 In Blanchard's paper, $\mu$ is the rate at which households “die” and are replaced by
new households with which they are entirely unconnected. Therefore, households alive
in period t expect to pay only the fraction $(1 - \mu)^i$ of the aggregate tax in period t + i.
In contrast, they expect to receive the entire present value of the interest payments on
the government debt. If $\mu$ is zero, households act as if they live forever or, alternatively,
as if they perfectly internalize the consumption decisions of future generations as in
Barro (1974). If $\mu$ is somewhat above zero, households act as if they have long, but
finite, horizons and are somewhat disconnected from future generations. If $\mu$ is nearly
one, households act as if they are even disconnected from their own future biological
selves. Therefore, $\mu$ measures not only the finiteness of life and the disconnectedness of
generations but also the myopia with which households foresee future taxes. In addition,
$\mu$ serves as a metaphor for how imperfectly human capital markets operate.

5 In eq. (3), $A_{t-1}$ is defined to include the real monetary base; hence the net taxes
subtracted from wage income to obtain disposable income include seigniorage in an
amount equal to the nominal interest rate times the inherited real monetary base.

6 If the government levies distorting taxes on the income from assets, the real holding-period
yield $R_t$, the real forward interest rates $\{F_{it}\}$, and the discount factors $\{\beta_{it}\}$
should be defined net of tax. These changes affect neither the theoretical nor the
empirical analysis below. It is assumed that the following transversality condition holds:
$\lim_{i \to \infty} (1 - \mu)^i \beta_{it} E_t A_{t+i} = 0$. This condition is satisfied if there are no bubbles in
financial markets, if the primary budget deficit is stationary, and if $\mu > 0$. If Ricardian
equivalence holds, more stringent conditions on the stochastic properties of the primary
budget deficit are necessary.
B. Empirical Implication if Forward Real Interest Rates
Are Constant and Equal

A common assumption in the consumption literature is that forward
real interest rates are constant and equal at every horizon. This assumption
implies that for every i and t

\[ \beta_{it} = \beta^i, \tag{6} \]

where $\beta$ is a parameter satisfying $0 < \beta < 1$. Substituting equation (6)
into equation (5) yields

\[ C_t = \alpha \sum_{i=0}^{\infty} (1 - \mu)^i \beta^i E_t (C_{t+i} + \mu A_{t+i}). \tag{7} \]

Lagging equation (7) one period, multiplying both members by
$1/[\beta(1 - \mu)]$, subtracting the resulting equation from equation (7), and
rearranging yields 7

\[ C_t = \frac{1 - \alpha}{\beta(1 - \mu)}\, C_{t-1} - \frac{\alpha\mu}{\beta(1 - \mu)}\, A_{t-1} + U_t, \tag{8} \]

where

\[ U_t \equiv \alpha \sum_{i=0}^{\infty} (1 - \mu)^i \beta^i (E_t - E_{t-1})(C_{t+i} + \mu A_{t+i}). \tag{9} \]

By construction, $U_t$ is uncorrelated with all information available to
households in period t - 1 and hence with $C_{t-1}$ and $A_{t-1}$. Therefore,
the ordinary least squares estimator of the coefficient on $A_{t-1}$ has a
zero probability limit if Ricardian equivalence holds and a negative
probability limit if Blanchard’s alternative holds. Section IV tests this
empirical implication.
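The sign prediction can be checked numerically from the coefficients that follow when equation (7) is lagged, scaled by 1/[β(1 - μ)], and differenced as described. The sketch below is illustrative only: the function name is hypothetical and the parameter values are invented, not estimates from the paper.

```python
def lagged_regression_coefficients(alpha, beta, mu):
    """Population coefficients on C_{t-1} and A_{t-1} implied by eq. (7).

    Lagging eq. (7), dividing by beta * (1 - mu), and differencing gives
        C_t = [(1 - alpha) / (beta (1 - mu))] C_{t-1}
              - [alpha mu / (beta (1 - mu))] A_{t-1} + U_t.
    """
    gamma = beta * (1.0 - mu)
    coef_c_lag = (1.0 - alpha) / gamma
    coef_a_lag = -alpha * mu / gamma
    return coef_c_lag, coef_a_lag


# Hypothetical parameter values, chosen only to show the sign pattern.
ricardian = lagged_regression_coefficients(alpha=0.05, beta=0.99, mu=0.0)
blanchard = lagged_regression_coefficients(alpha=0.05, beta=0.99, mu=0.02)
```

Under these assumed values the coefficient on lagged assets is zero in the Ricardian case and negative in the alternative, which is the property the ordinary least squares test exploits.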

C. Approximate Empirical Implication if Forward Real 
Interest Rates Vary 

Taking logarithms of both members of equation (5), using equation
(2) to eliminate $\beta_{it}$, and rearranging yields

\[ \ln C_t = \ln \alpha + \ln(C_t + \mu A_t) + \ln E_t\Bigl[1 + \sum_{i=1}^{\infty} (1 - \mu)^i \exp\Bigl(\sum_{j=1}^{i} x_{jt}\Bigr)\Bigr], \tag{10} \]

where $x_{it} \equiv \Delta \ln(C_{t+i} + \mu A_{t+i}) - f_{it}$ and $f_{it} \equiv \ln(1 + F_{it})$. Following
Campbell and Shiller (1986), the Appendix shows that equation (10)

7 This is a discrete-time, stochastic analogue of Blanchard's eq. (8).
can be approximated as

\[ \Delta \ln C_t - r_t = \kappa - \frac{(1 - \gamma)\mu}{\gamma} \left(\frac{A_{t-1}}{C_{t-1}}\right) + u_t, \tag{11} \]

where $r_t \equiv \ln(1 + R_t)$, $\gamma \equiv (1 - \mu)\exp(\bar{x})$, $\bar{x}$ is the unconditional mean
of $x_{it}$, $\kappa \equiv \bar{x} - [(1 - \gamma)/\gamma] \ln[\alpha/(1 - \gamma)]$, and

\[
\begin{aligned}
u_t \equiv{} & \sum_{i=0}^{\infty} \gamma^i (E_t - E_{t-1}) \Delta \ln(C_{t+i} + \mu A_{t+i}) \\
& - (1 - E_{t-1}) r_t - \sum_{i=1}^{\infty} \gamma^i (1 - E_{t-1}) f_{it} \\
& - (E_{t-1} r_t - f_{1,t-1}) - \sum_{i=1}^{\infty} \gamma^i (E_{t-1} f_{it} - f_{i+1,t-1}).
\end{aligned}
\tag{12}
\]

By construction, $\sum_{i=0}^{\infty} \gamma^i (E_t - E_{t-1}) \Delta \ln(C_{t+i} + \mu A_{t+i})$, $(1 - E_{t-1}) r_t$,
and $\sum_{i=1}^{\infty} \gamma^i (1 - E_{t-1}) f_{it}$ are uncorrelated with all information available
to households in period t - 1. It is assumed that the term premia
$E_{t-1} r_t - f_{1,t-1}$, $E_{t-1} f_{1t} - f_{2,t-1}$, $E_{t-1} f_{2t} - f_{3,t-1}, \ldots$ contribute negligibly
to the variance of $u_t$. This will be true if the expectations theory
of the term structure holds. It will also be approximately true if the
expectational errors in equation (12) are much more variable than the
term premia because, say, households cannot accurately predict the
future evolution of $r_t, f_{1t}, f_{2t}, f_{3t}, \ldots$ or $\Delta \ln(C_t + \mu A_t)$. In either case,
$u_t$ can be taken to be serially uncorrelated and uncorrelated with
$A_{t-1}/C_{t-1}$ as well. Therefore, the ordinary least squares estimator of
the coefficient on $A_{t-1}/C_{t-1}$ has a zero probability limit if Ricardian
equivalence holds and a negative probability limit if Blanchard’s alternative
holds. 8 Section IV tests this empirical implication.


D. Empirical Implication for Intervention Analysis 


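Suppose that ordinary least squares is used to estimate a regression of
the form 9

\[ c_t = \pi_0 + \sum_{i=1}^{n} \pi_{1i} c_{t-i} + \sum_{i=0}^{n} \pi_{2i} d_{t-i} + \sum_{i=0}^{n} \pi_{3i} \Delta \ln A_{t-i} + \sum_{i=0}^{n} \pi_{4i} g_{t-i} + v_t, \tag{13} \]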


8 The sum in the right-hand member of eq. (A6) and hence eq. (10) exists only if
$\gamma < 1$. Since $\gamma$ must also be positive, $\mu > 0$ if and only if $-(1 - \gamma)\mu/\gamma < 0$.

9 Note that current and past ratios of output to nondebt assets are implicitly included
in eq. (13) because past values of c and current and past values of $\Delta \ln A$ and g are
included.
where $D_t$ is the stock of government debt, $K_t \equiv A_t - D_t$, $c_t \equiv C_t/K_t$,
$d_t \equiv D_t/K_t$, $g_t$ is the ratio of government purchases to $K_t$, the $\pi$’s are
regression coefficients, n is a nonnegative integer, and the $v$’s are the
residuals from the regression. Let $I_t$ and $X_t$ be the set of information
used by households in forming expectations at time t and the set of
regressors used in equation (13), respectively. The Appendix shows
that if $X_t$ is a subset of $I_t$, then

00 

plim t>, * ~ Ed t+ ,\X t ) + e, (14) 

to a first approximation. In equation (14), the parameter $\phi$ satisfies
$0 < \phi < 1$, and the error term $e_t$ incorporates all effects on consumption
that do not result from revised expectations of the future path
of $d_t$ and that cannot be predicted using $X_t$.

Consider an intervention that, ceteris paribus, leads households to
predict a higher (lower) future path for $d_t$ than the one that can be
predicted using $X_t$ alone. Because the ceteris paribus restriction requires
$e_t$ to be zero, equation (14) implies that plim $v_t$ must be positive
(negative) if $\mu > 0$ and must be zero if $\mu = 0$. Sections V and VI test
whether the residuals from a regression of the form (13) do indeed
behave in this fashion.


III. The Data 10

This section describes the data used in the empirical analysis and 
discusses a statistical problem associated with their use. 

As in Hall (1978) and others, real expenditure on nondurable
goods and services is used in lieu of consumption. 11 Using this measure
enables one to avoid the difficult task of estimating the service
flow from the stock of consumer durables but is strictly valid only
under rather restrictive conditions. 12 These data and most of those


10 The data and the programs used for this paper are available on request. 

11 The consumption series used is the sum of National Income and Product Accounts ser.
1.2.4 and 1.2.5. A measure that excludes real expenditures on clothing and shoes has
also been tried because Darby (1974) has argued that these goods are durable. Using
this alternative measure alters no conclusions.

12 Sufficient conditions are that the components making up real expenditure on
nondurable goods and services have constant relative prices so that they can be treated
as a Hicks composite commodity and that the momentary utility function be separable
between this composite commodity and the service flows from consumer durables. It is
also assumed that the momentary utility function is separable between the consumption
of private goods and services and the consumption of governmentally provided goods
and services and that it is separable between the consumption of market goods and the
consumption of nonmarket goods including leisure. See Aschauer (1985) for a paper
relaxing the former assumption and MaCurdy (1981) for a paper relaxing the latter
assumption. Note that the latter assumption permits one to abstract from the distor- 



JOURNAL OF POLITICAL ECONOMY

described below come from the National Income and Product Accounts
(NIPA), 1929-82, and the July 1986 edition of the Survey of Current
Business.

The real federal interest-bearing debt is calculated by dividing
Cox’s (1985) measure 13 of the nominal market value of the privately
held federal debt at quarter’s end by the consumer price index in the
last month of the quarter and then seasonally adjusting with the multiplicative
seasonal adjustment program in TSP. The real monetary
base is calculated by dividing the sum of currency in circulation and
deposits of banks at the Federal Reserve (source: Banking and Monetary
Statistics, 1941-70, and Annual Statistical Digest, various years) by
the consumer price index and then seasonally adjusting. The data on
currency in circulation, deposits at the Federal Reserve, and the consumer
price index are for the last month of each quarter. The nominal
debt of the state and local governments is calculated by cumulating
the nominal budget deficit of the state and local governments
(NIPA ser. 3.3.26) and assuming that the net nominal debt of the
state and local governments was $5.3 billion at the end of 1946:IV. 14
Dividing by the consumer price index in the last month of the quarter
then yields the real debt of the state and local governments. The
series D is the sum of the real federal interest-bearing debt, the real
monetary base, and the real debt of the state and local governments.
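The construction of the state and local debt series can be sketched as follows. This is a minimal illustration, not the author's program: the function names are hypothetical, the deficits are assumed to be expressed as per-quarter amounts, and the deficit and price-index numbers in the usage lines are invented. Only the benchmark ratio 0.1/0.0187 comes from the text (n. 14).

```python
def benchmark_debt(net_interest_paid, average_yield):
    """Initial nominal debt inferred as net interest paid divided by the yield."""
    return net_interest_paid / average_yield


def real_state_local_debt(initial_nominal_debt, nominal_deficits, cpi):
    """Cumulate nominal deficits from the benchmark and deflate by the CPI.

    nominal_deficits[k] is the (assumed per-quarter) deficit in quarter k and
    cpi[k] is the price index in the last month of that quarter.
    """
    real_debt = []
    nominal_debt = initial_nominal_debt
    for deficit, price in zip(nominal_deficits, cpi):
        nominal_debt += deficit          # cumulate the nominal deficit
        real_debt.append(nominal_debt / price)
    return real_debt


# Benchmark from n. 14: 0.1 / 0.0187, roughly $5.3 billion at the end of 1946:IV.
initial = benchmark_debt(0.1, 0.0187)

# Invented deficits (billions) and price index values, for illustration only.
example = real_state_local_debt(initial, [0.2, 0.3, -0.1], [1.00, 1.01, 1.02])
```

The federal interest-bearing debt and the monetary base would be deflated the same way and added to this series to obtain D; seasonal adjustment is omitted from the sketch.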

The end-of-quarter domestic privately held stocks of nonresidential
capital, residential capital, and consumer durables are interpolated
from end-of-year data (Survey of Current Business, January,
August 1986). In the interpolation, it is assumed that depreciation
occurs at a constant rate each year and that the real expenditures on
nonresidential capital, residential capital, and consumer durables
(NIPA ser. 1.2.8, 1.2.10, 1.2.3) each quarter constitute the gross investments
in these stocks for that quarter. The nominal quarterly net
stock of foreign assets is calculated by cumulating nominal net foreign
investment (NIPA ser. 4.1.22) and benchmarking the series to the
estimated value at the end of 1984 (see 1986 Economic Report of the
President, table B-103). Dividing by the consumer price index in the


tions induced by the taxation of wage income. Finally, it is assumed that households
treat the real stock of base money as if it were a consumer durable. Given this assumption
and the separability assumption maintained above, one can abstract from the
distortions induced by the inflation tax.

13 Michael Cox has kindly provided me with updates to his published data.

14 This figure equals 0.1/0.0187. The figures $0.1 billion per year and 0.0187 percent
per year are the net interest paid by state and local governments in 1946:IV and the
average yield on Standard and Poor's municipal bonds during 1946:IV. These data
come from NIPA ser. 3.3.18 and from table 12.12 of Banking and Monetary Statistics.
This initial condition does not much affect the properties of the series D and A.
last month of the quarter then yields the real stock of foreign assets.
The series A is simply D plus the real net privately held stocks of 
domestic nonresidential capital, domestic residential capital, domestic 
consumer durables, and foreign assets. 

The NIPA series 1.2.18 is used as the measure of real government
purchases. Because this series and the consumption series are average
rates of flow over each quarter, they are denoted $\bar{C}$ and $\bar{G}$. The series
c, g, and d are C/K, G/K, and D/K.

The average continuously compounded ex post real interest rate
from quarter t - 1 to quarter t is calculated as

$$r_t = \ln[1 + (1 - \tau_{t-1})\mathrm{TBR}_{t-1}] - \Delta \ln P_t, \tag{15}$$

where $\mathrm{TBR}_{t-1}$ is the average 3-month Treasury bill rate during
quarter t - 1 (see Banking and Monetary Statistics and Annual Statistical
Digest), $\tau_{t-1}$ is the average marginal tax rate expected during quarter
t - 1 to prevail over the next 3 months, and $P_t$ is the implicit price
deflator for expenditure on nondurable goods and services. 15 Six
measures of $\tau$ have been tried: 0, .1, .2, .3, .4, the ratio of personal
income tax revenue to personal income less transfer payments, and
one minus the ratio of Moody's Aaa municipal bond rate to Moody's
Aaa corporate bond rate. Because the inferences drawn below
are unaffected by which measure is employed, only the estimates for
$\tau = 0$ are reported.
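
As an illustration, the after-tax ex post real rate just defined can be computed as follows. This is a sketch only: the function name and arguments are hypothetical, and the Treasury bill rate is assumed to be supplied as a decimal rate per quarter (not an annualized percentage).

```python
import math

def ex_post_real_rate(tbr_prev, tau_prev, p_now, p_prev):
    """Continuously compounded after-tax ex post real rate for one quarter.

    tbr_prev: Treasury bill rate for quarter t-1, as a decimal per quarter.
    tau_prev: marginal tax rate expected in quarter t-1 (0 for no taxes).
    p_now, p_prev: price deflator in quarters t and t-1.
    """
    after_tax_nominal = math.log(1.0 + (1.0 - tau_prev) * tbr_prev)
    inflation = math.log(p_now) - math.log(p_prev)  # Delta ln P_t
    return after_tax_nominal - inflation
```

Setting `tau_prev=0.0` corresponds to the case for which estimates are reported in the text.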

Equations (8) and (11) were derived assuming that C and A are
chosen simultaneously. However, because the consumption series C
used in the empirical analysis is averaged over a quarter, it actually
results from choices made before the end-of-quarter stock of assets is
realized. Consequently, the error term $U_t$ in the equation

$$C_t = \delta C_{t-1} + \theta A_{t-1} + U_t \tag{16}$$

is serially correlated and correlated with $C_{t-1}$ and $A_{t-1}$ as well. 16
Similarly, the error term $u_t$ in the equation

$$\Delta \ln C_t - r_t = \psi + \omega\left(\frac{A_{t-1}}{C_{t-1}}\right) + u_t \tag{17}$$

is serially correlated and correlated with $A_{t-1}/C_{t-1}$. In equations (16)
and (17), $\delta$, $\theta$, $\psi$, and $\omega$ are parameters.

The Appendix demonstrates that under the null hypothesis of
Ricardian equivalence $U_t$ is a first-order moving average, $C_{t-2}$ and

15 The series P is the sum of NIPA ser. 1.1.9 and 1.1.4 divided by the sum of NIPA
ser. 1.2.9 and 1.2.4.

16 I am indebted to Pok-Sang Lam for this observation. See Christiano, Eichenbaum,
and Marshall (1987) for further discussion of the biases that result from fitting con¬ 
sumption models to time-aggregated data. 
$A_{t-2}$ are uncorrelated with $U_t$, and $\theta = 0$. In contrast, if Blanchard's
alternative hypothesis holds, $\theta < 0$. Therefore, under the null
hypothesis, equation (16) can be consistently estimated using $C_{t-2}$ and
$A_{t-2}$ as instrumental variables for $C_{t-1}$ and $A_{t-1}$, and Ricardian
equivalence can be tested against Blanchard's alternative by examining
the estimate obtained for $\theta$. The Appendix also establishes similar
propositions for $u_t$ and $\omega$.
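
The instrumental variable estimator described here is just identified (three instruments for three regressors once a constant is included), so it reduces to solving $Z'X\,b = Z'y$. The following pure-Python sketch illustrates that calculation; it is not the paper's code, the function names are hypothetical, and it produces point estimates only (it does not implement the generalized method of moments test statistics).

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def iv_consumption(c, a):
    """IV estimates [constant, delta, theta] for
    C_t = constant + delta*C_{t-1} + theta*A_{t-1} + U_t,
    using (1, C_{t-2}, A_{t-2}) as instruments."""
    dates = range(2, len(c))
    x = [[1.0, c[t - 1], a[t - 1]] for t in dates]  # regressors
    z = [[1.0, c[t - 2], a[t - 2]] for t in dates]  # instruments
    y = [c[t] for t in dates]
    zx = [[sum(zr[i] * xr[j] for zr, xr in zip(z, x)) for j in range(3)]
          for i in range(3)]
    zy = [sum(zr[i] * yt for zr, yt in zip(z, y)) for i in range(3)]
    return solve(zx, zy)
```

Under the null hypothesis the third element of the returned list, the coefficient on lagged assets, should be close to zero; under Blanchard's alternative it should be negative.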

Because the implication derived in Section IID does not depend on
consistent estimation of equation (13), using time-aggregated data 
does not affect how one should carry out the intervention analyses of 
Sections V and VI. 

IV. Two Tests of Ricardian Equivalence 

The estimates reported in tables 1 and 2 result from fitting
equations (16) and (17) to the data described in the previous section for
the sample periods 1947:II-1985:IV, 1947:II-1966:II,
1966:III-1985:IV, 1947:II-1959:IV, 1960:I-1972:IV, and
1973:I-1985:IV. Each regression was fitted using the following instrumental
variables: the constant term, $C_{t-2}$, and $A_{t-2}$ for equation (16); and the
constant term and $A_{t-2}/C_{t-2}$ for equation (17). Because the error
term in each equation should be a first-order moving average under
the null hypothesis, standard errors calculated in the usual manner
are inconsistent and are thus not reported. Instead, Hansen's (1982)
generalized method of moments was used to calculate statistics
for testing whether the coefficients on $A_{t-1}$ in equation (16) and on
$A_{t-1}/C_{t-1}$ in equation (17) are each zero. The marginal significance
levels of these test statistics are reported in tables 1 and 2.

If Blanchard's alternative holds, the estimated coefficients on $A_{t-1}$
in table 1 and on $A_{t-1}/C_{t-1}$ in table 2 should be negative. In fact, in
table 1, only two of six estimates are negative, none of the negative
estimates is statistically significant at the .05 level, and two of the
positive estimates are statistically significant. In table 2, only three of
the six estimates are negative, none of the negative estimates is
statistically significant, and one of the positive estimates is statistically
significant. Therefore, on the basis of the evidence in tables 1 and 2,
Ricardian equivalence cannot be rejected in favor of Blanchard's
alternative.

V. Intervention Analysis of the Postwar Tax Cuts 

The interventions considered here are the major tax changes enacted 
during the postwar period. Because observing the legislative process 

17 Ordinary least squares yields similar estimates. 
TABLE 1

Instrumental Variable Regressions of Consumption on Lagged Consumption
and Lagged Assets

                        Coefficient on           Marginal
                                                 Significance
Sample Period        $C_{t-1}$    $A_{t-1}$      Level for $A_{t-1}$

1947:II-1985:IV        .986        .0047           .093
1947:II-1966:II       1.048       -.0080           .089
1966:III-1985:IV       .830        .0393           .011
1947:II-1959:IV       1.038       -.0060           .506
1960:I-1972:IV        1.007        .0005           .927
1973:I-1985:IV         .807        .0442           .007

Note: The instrumental variables are the constant term, $C_{t-2}$, and $A_{t-2}$. Each regression was estimated with a
constant term, which is not reported here.


TABLE 2

Instrumental Variable Regressions of the Growth Rate of
Consumption on the Lagged Ratio of Assets to
Consumption

                     Coefficient             Marginal Significance
Sample Period        on $A_{t-1}/C_{t-1}$    Level for $A_{t-1}/C_{t-1}$

1947:II-1985:IV         .0342                   .035
1947:II-1966:II         .0454                   .151
1966:III-1985:IV       -.0511                   .633
1947:II-1959:IV         .0845                   .171
1960:I-1972:IV         -.0173                   .412
1973:I-1985:IV         -.1023                   .428

Note: The instrumental variables are the constant term and $A_{t-2}/C_{t-2}$. Each regression
was estimated with a constant term, which is not reported here.


should enable tax changes to be predicted at least somewhat in advance,
households should have more information about the future
path of $d_t$, the ratio of the government debt to nondebt assets, than
that provided by $X_t$, the regressors in equation (13). In particular,
prior to tax cuts (hikes), households should typically expect lower
(higher) tax rates and hence a higher (lower) future path for $d_t$ than
what can be predicted using $X_t$ alone. 18 Suppose that prior to tax


18 It is assumed here that tax revenue is an increasing function of tax rates. The
possibility that the economy is on the declining segment of the Laffer curve is therefore
ruled out. In addition, it is assumed that taxes are changed not only to finance expected
future changes in government purchases and to respond to changes in $X_t$ but also to
accomplish other purposes. In terms of eq. (14), the assumption in the text together
with these assumptions implies that on average $E d_{t+i}|I_t - E d_{t+i}|X_t$ is positive prior to
tax cuts and negative prior to tax hikes.
TABLE 3

t-Ratios for Two Tax Cuts

       Kennedy-Johnson        Reagan Tax Cut
N      Tax Cut (1964:I)       (1981:III)

1         -.66                  -3.18*
2         -.47                  -2.25*
3         -.38                  -1.83
4         -.33                  -1.59

* Statistically significant at the .05 level.


changes the systematic effect that this altered expectation has on con¬ 
sumption dominates all other systematic effects. 19 In that case, the 
residuals from fitting equation (13) should tend to be zero before tax 
changes if Ricardian equivalence holds and should tend to be positive 
prior to tax cuts and negative prior to tax hikes if Blanchard’s alterna¬ 
tive holds. 

Equation (13) was fitted over the sample period 1948:II-1985:IV
using least squares and setting r = 4; t-ratios of the form

$$\tau(N, T) = \frac{\sum_{i=0}^{N-1} \hat{v}_{T-i}}{s\sqrt{N}} \tag{18}$$

were calculated, where $\hat{v}_{T-i}$ is the residual dated T - i and s is the
standard error of the regression (13).
On the null hypothesis that the anticipation of tax changes does not
affect consumption, $\tau(N, T)$ is approximately distributed as standard
normal for large samples. Testing whether the null hypothesis can be
rejected in favor of the alternative hypothesis that consumption rose
in anticipation of a given tax cut or fell in anticipation of a given tax
hike is then simply a matter of comparing t-ratios to appropriate
critical values of the standard normal distribution.
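
A statistic of this kind can be sketched as below. The sketch assumes that the statistic is the sum of N consecutive regression residuals ending at a chosen date, divided by the regression standard error times the square root of N; the exact dating of the window relative to the enactment quarter is an assumption here, and the function name and arguments are hypothetical.

```python
import math

def t_ratio(residuals, end, n, s):
    """Sum of the n residuals dated end-n+1 .. end, scaled by s*sqrt(n).

    residuals: list of regression residuals, indexed by quarter.
    end:       index of the last quarter in the window (assumed convention).
    n:         number of quarters cumulated.
    s:         standard error of the regression.
    """
    if n < 1 or end - n + 1 < 0 or end >= len(residuals):
        raise ValueError("window falls outside the residual series")
    window = residuals[end - n + 1 : end + 1]
    return sum(window) / (s * math.sqrt(n))
```

If the residuals were independent with standard deviation s, the returned value would have unit variance, which is what makes comparison with standard normal critical values meaningful.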

For example, consider table 3, which reports $\tau(N, T)$ in the four
quarters leading up to the Kennedy-Johnson tax cut in 1964:I and
the Reagan tax cut in 1981:III. Even though both tax cuts reduced
personal income tax rates massively, consumption did not rise
significantly in anticipation of either tax cut. Indeed, the anticipation of
the Reagan tax cut may have reduced consumption.

Over the sample period, Congress has enacted 11 major tax cuts
and nine major tax hikes. 20 The ratio $\tau(N, T)$ was calculated for each

19 In terms of eq. (14), it is assumed that on average prior to tax changes $\varepsilon_t$ is
negligible in magnitude relative to $\mu[\alpha/(1 - \alpha)](E d_{t+i}|I_t - E d_{t+i}|X_t)$. See n. 22 for
further discussion of this assumption.

20 Table 3-1 of Pechman (1984) was used to assign enactment dates for the tax
changes.
TABLE 4

Weighted t-Ratios

Weighting                      N
Scheme (a)       1         2         3         4

i              -1.99*    -1.40     -1.15      -.99
ii             -2.34*    -1.65     -1.35     -1.17
iii            -3.04*    -2.15*    -1.76     -1.52
iv             -3.07*    -2.17*    -1.77     -1.54
v              -2.37*    -1.67     -1.37     -1.18
vi             -3.81*    -2.69*    -2.20*    -1.90

(a) The weighting schemes are (i) tax cuts 1 and tax hikes 0, (ii) tax cuts 0 and tax hikes -1, (iii) tax cuts 1 and tax
hikes -1, (iv) tax cuts $P_i$ and tax hikes 0, (v) tax cuts 0 and tax hikes $P_i$, and (vi) both tax cuts and tax hikes $P_i$. The
weights $\{P_i\}$ are the static revenue losses from table 3-1 of Pechman (1984) as a percentage of GNP.

* Statistically significant at the .05 level.


of the four quarters leading up to each of these tax changes. Table 4
reports weighted averages of these t-ratios, calculated using the
formula

$$\bar{\tau}(N) = \frac{\sum_i \Omega_i \tau(N, T_i)}{\left(\sum_i \Omega_i^2\right)^{1/2}}, \tag{19}$$

where $\{T_i\}$ and $\{\Omega_i\}$ are the enactment dates and the weights for the
tax changes. If the t-ratios for the tax changes are independent of
each other 21 and if the null hypothesis holds, then $\bar{\tau}(N)$ is
approximately distributed as standard normal for large samples.

Six weighting schemes were used: (i) tax cuts 1 and tax hikes 0, (ii)
tax cuts 0 and tax hikes -1, (iii) tax cuts 1 and tax hikes -1, (iv) tax
cuts $P_i$ and tax hikes 0, (v) tax cuts 0 and tax hikes $P_i$, and (vi) both tax
cuts and tax hikes $P_i$. The weights $\{P_i\}$ are the static revenue losses
from table 3-1 of Pechman (1984) as a percentage of nominal GNP.
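
The weighted average of t-ratios can be illustrated with a short Python sketch. It assumes the statistic is the weighted sum of the individual t-ratios divided by the square root of the sum of squared weights, which is the normalization that leaves a unit-variance statistic when the individual t-ratios are independent standard normals; the function name is hypothetical.

```python
import math

def weighted_t_ratio(t_ratios, weights):
    """Weighted sum of t-ratios, normalized by the root of summed squared weights."""
    if len(t_ratios) != len(weights):
        raise ValueError("one weight is needed per t-ratio")
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0.0:
        raise ValueError("at least one weight must be nonzero")
    return sum(w * t for w, t in zip(weights, t_ratios)) / norm
```

A weight of zero drops a tax change from the average, and a weight of -1 on tax hikes flips their sign so that a consumption fall before a hike and a rise before a cut both count toward a positive statistic.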

If the anticipation of tax cuts increases consumption and if the
anticipation of tax hikes decreases consumption, then the weighted
t-ratios in table 4 should tend to be positive, and many should be
significantly so. In fact, all are negative, and many are statistically
significant. This finding is further evidence that Ricardian
equivalence cannot be rejected in favor of Blanchard's alternative. 22


21 This assumption is only approximately true in this application because a few of the
tax changes occurred within a year of each other.

22 The results reported in table 4 would actually enable one to reject Ricardian
equivalence in favor of a model in which households have negative death rates and thus
live longer than forever. Because this inference is absurd, these results can be
interpreted only as lack of support for Blanchard's alternative to Ricardian equivalence. The
apparent negativity of $\mu$ does suggest that something else is going on, however. Accord-
TABLE 5

t-Ratios for Residuals Cumulated from 1981:IV to the
Indicated Date

                     Quarter
Year        I        II       III       IV

1981                                   -.13
1982      -.44     -.48     -.55      -.17
1983      -.21      .04      .11       .11
1984       .16      .14     -.10      -.23
1985      -.18     -.22     -.37      -.22


VI. Did the Reagan Tax Cut Raise Consumption? 

Table 3 shows that the anticipation of the Reagan tax cut did not raise
consumption. One might argue that this occurred, not because
Ricardian equivalence holds, but because households did not realize that
the government debt would grow rapidly after the Reagan tax cut
went into effect. If so, then households must have learned about the
rapid growth after the tax cut was enacted in 1981:III. Therefore, if
$\mu$ is appreciably positive and if there were no other systematic effects
on consumption, equation (14) implies that the residuals $\{\hat{v}_t\}$ should
have tended to be positive and appreciable after 1981:III.

Table 5 reports the t-ratios of the residuals cumulated from
1981:IV to each quarter between 1981:IV and 1985:IV. These
t-ratios are not significantly positive at any conventional level. Indeed,


ing to eqq. (A 15) and (A 16), if Ricardian equivalence holds, 

plimi), * - -( T -L-)(( r 5_)£ V(/„ - Ef„\X,) 

+ a £ [(__£_) £ X'<£/,» .11/ - Efa+,\X,) 

- (EA In A’,-,,)/, - EA In A, +l |X,)]j. 

where the parameter $\bar{c}$ is the unconditional mean of $c_t$, and $\lambda$, which satisfies $0 < \lambda < 1$,
is defined in the Appendix. If tax cuts (hikes) raise (lower) after-tax real forward
interest rates and if Ricardian equivalence holds, the anticipation of a tax cut (hike) may
well produce negative (positive) values of plim $\hat{v}_t$. In other words, the ceteris paribus
restriction may be a poor approximation when Ricardian equivalence is a good
approximation. See Judd (1987) for evidence that the typical postwar tax cut (hike) has
decreased (increased) consumption if Ricardian equivalence is a good approximation.
Kormendi (1983) has also found evidence that consumption decreases as the
government debt increases. He argues that this effect arises because each household's future
taxes are uncertain even though its interest receipts are not. See Chan (1983) for a
demonstration that Kormendi's intuition may be sound.
most are negative. Therefore, the Reagan tax cut appears not to have
raised consumption, and Ricardian equivalence cannot be rejected in 
favor of Blanchard’s alternative. 

VII. Conclusions 

Using quarterly postwar U.S. data, this paper has tested Ricardian 
equivalence against Blanchard’s alternative model. None of the tests 
supports this alternative. Ricardian equivalence may therefore be a 
reasonable approximation. 

Ricardian equivalence implies that for any given path of govern¬ 
ment purchases, shifting between tax and debt finance does not affect 
aggregate demand. Consequently, output, the price level, interest 
rates, and the exchange rate should not be affected. These implica¬ 
tions of Ricardian equivalence have received much empirical support 
in the literature. For example, Dwyer (1982) has found no evidence 
that output, the price level, and short-term nominal interest rates are 
related to budget deficits; Plosser (1982, 1987) and Evans (1985,
1987a, 1987b, in press a, in press b) have found no evidence that
budget deficits raise interest rates; and Evans (1986) has found no 
evidence that the U.S. budget deficit raises the exchange value of the 
U.S. dollar. 

Appendix 

Heuristic Demonstrations 

Derivation of Equation (11) 

Equation (10) can be rewritten as 

In C, = ln(C, + pA,) + In a + In£,z„ (Al) 

where 

■ i 

z, * 1 + ^ (! - It)' expQ>] xA (A2) 

I-1 V-i ' 

Hence, 

i » n 

-~ 2 ~ = (i - pv e *p(2 */') X* 1 ~ »*>* exp (Z (A3) 

Evaluating z, and this derivative at xn * xg, = *j, = ... = * produces 
1/(1 - y) and y'/(l - y), where y » (1 - p)exp(S). Consequently, expanding 
the left-hand member of equation (A2) to first-order terms around *i, 
= *2f = *s» = ■ • • “ * yields 
Substituting the approximation (A4) into equation (A1) then results in

InCi * ln(C, + | iA t ) + ln|~—) + Injl + J Y%(*ii ~ *)]• (A5) 
If |2*-i y£ f (*n - S)| < 1, then the approximation 

In C, * ln(~—] + ln(C, + Mt) + X Y£ito - 3E) (A6) 

V 1 “ If >-i 

is also close. Lagging the approximation (A6) one period, multiplying each
member of the resulting approximation by $1/\gamma$, and subtracting this
approximation from the approximation (A6) yields

CO 

In C, - ^ In C,-i A In + ln(C, + Mi) + X V^ifoi " *) 

- i[ln (-]-“) + + Mi-i) 

ao 

+ X 1 (*./-! - *)1 

»-l i 

PC 

* k + ln(C, + Mi) + X Y(£i*u - £f-i*i+u-i) 


— {ln(C<-i + Mi-1) + y[£i-ihIn(C, + Mi) ~ fu-i]) 

y 


or 


x + /u-i - | - j ln(C|-i + Mi-i) 

aa 

+ X V(£| - £»-i)A In (C/+, + Mi+i) 

• -0 
UP 

+ X Y(/« - /<+»'-j) 

- ^ 1 - ^ ~ j ln(C,-i + Mi-i) + u, 

* - + ’• - (■ i r i )( inc -' + 4 1+ K'fc;)]) + 


(A7) 


• -0 
OD 

I 

i-1 
= K + T, 


* K + 


A to C, - r, * « - + «■• 


(A8) 
Derivation of Equation (14)

Dividing both members of equation (5) by K, and rearranging yields 




n(l + d,) 


* - (rh-){* 

+ £ a - to*, + ^(i ♦ *«+.)]}, 


(A9) 


where c, ■> C t IK, and d ,» D,/K,. Using equation (2) to eliminate ft,, one obtains 


Ud + d,) 


(r^)(< 

+ y^ (I - (t)'£,exp|^ (A In K l+j - f jt ) j [c (+j + (1.(1 + d t+ ,)]j. 


(A 10) 


Linearizing equation (A 10) around Ain X <+J = A In K t+i = A lnX, + 3 = ... = 
k,fu — fit — fit — — f,c t +\ — c,+j = e (+ 3 = ... — c, and d 1+ i = d l+ % = d,+j 

- ... - d results in the approximation 


06 

c, * l + ~ 5) + Z V£ ' {l(C/+. - *) + - 

+ t(AlnX <+ , - A) - (/., -/)]}). 


d) ] 
(All) 


where X «* (1 - (t)exp(A - /), S and A are the unconditional means of d, 
and A In K„ and the parameter / is chosen so that c is the unconditional 
mean of c,. it is assumed that all the sums exist and hence that 0 < X < 1. It 
is also assumed that [a/(l — a)] 1 X‘ < 1, or a + X< l,so that (A11) has 

a stable solution for c,. 

If X, is a subset of f„ then applying the operator 1 - £(-|X,) to both 
members of the approximation (A 11) and rearranging produces 

PC 

c t - Ec,\X t * 2 X 'E, ([(£ci+,|/» - Ec l+ ,\X') 

+ |t(£d, +i |/, - Ed, + ,|X,)1 

(A 12) 

+ [ ? - [' ■ jy -- ] l(£AIn K,+,\l, - £AInX,+,|X,) 

- (fu ~ £/it|X,)]} 

since $d_t$ is in $X_t$. Updating the approximation (A11) i periods, applying the
operator $E(\cdot|I_t) - E(\cdot|X_t)$, and rearranging the resulting approximation
produces

Ec {+i \I, - Ec, +t \X, * - Ed t+i \X') 

OB 

+ £ A> {[(£<;, +1+J |/, - Ec l+i+J \X.) 
i-i 1 

+ |».{£d/ + i+j|// — £d/ + i+;|X,)] 

+ [ ? + - f -r f ^ -] t(£i<A,S> 

- £AlnK, +1+ ,)X ( ) 

- (Efj,+ ,\1, - £/ >(+1 |X 1 )]Jj. 


Repeatedly using the approximation (A 13) to eliminate £c, + 1 1/ ( - Ec,+ i|X ( , 
£r, +2 |/ f - £r, + J|X„ Ec,+ s\l, - Ec,+ s \X„ .. . from the approximation (A12) 
yields 


(A14) 


X> 

c, - Ec,\X, * 2 (£d, +1 |/, - Ed l+ ,\X,) 

+ a j^. Lt MLt j2. j |^ ( £-A In K„,\h - In AT f+ .|X,) 

oc 

“ (y~) S ~ £/y« + .l^)]} 

- (t^) [ ? 7-v - 5> ] | x '<* - 

where <|» * [a/(l - at)]X, which satisfies 0 < 6 < 1 since a + atX < a + X < 1. 
Because plim ti t * c, - £c,|X„ the approximation (A14) can be rewritten as 

ac 

plim v, = p ^ °— ) 4)'(£d| + ,|/, - Ed, +i \X,) + e„ (A15) 

where 


eo 

+ a V [(£A In K, +i |/, - £A In K, +I |X<) 

OP 

“ (-f“) 2 *Wfi+i\h - £/y/+.|Xt)]}- 


(A16) 



Time Aggregation

Equations (8) and (11) can be cast in the form 

?i “ pyt-i - tr*,-! + t, + «* (A 17) 

where y, is C, or In C„ ! is A,_ ( or A,_ |/C,_ i, z, is zero or r„ c, is U, or u„ p is 
(1 - o)/p(l - p.) or one, and cr is ap./P(l - p.)or(l - y)p,/y. The index t is 
measured in the units of time over which households simultaneously chose 
consumption and the end-of-period stock of assets. Let a quarter consist of n 
of these units of time, and let t be an index of time in quarters. Lagging 

equation <A17) 1,2.n - 1 times, multiplying by p 1 , p ,..., p"" , and 

adding the resulting equations to equation (A 17) yields 

V’ 1 V" 1 ' 

yt - p B ?,-» - o y p'*,-i_i + 1, + y (Ai8) 

,-0 t «0 

where i, is zero or f„ the ex post continuously compounded real interest rate 
between period I - n + 1 and period t. Lagging equation (A 18) 1, 2, . . . , 
n - 1 times, adding the resulting equations to (A 18), and dividing both 
members of the summed equation by n generates 


y, = p"y,_, - <rs,_ B + z, 

+ 


+ % v - <■■>*-—]• 


(A19) 


where 5 , * ( 1 /n) X"^d y,-jl 2 , is zero or f„ the average ex post continuously 
compounded real interest rate between each period in the quarter ending at 
t - n and each period in the quarter ending at t\ and 

(1 “ * It (p '' '] 

n n -I 

+ ^ (l - p'K*,., - *,_„) + 2 ^ (p’ - p- *«-„)} 

1-1 1-1 1 

- (Mrhh - 1,(1 ' '»*-■ + 1 “ - ■€ 


(A20) 
Substituting equation (A20) into equation (A19) then yields

% - p"*— - ( iL —+ l, + *„ (A21) 


where 


H- 1 


-2 fcfcf)--*■]*. 

n w — 1 

+ X (1 “ P*)«<+l- + X “ p") «/-» + !-»}• 

i-t i«l 1 

Reindexing equation (A21) results in 


(A22) 


(A2S) 


Clearly, the error term € T is serially correlated. Suppose that Ricardian 
equivalence holds so that a = 0. Equation (A22) implies Et r i r ~, ** 0 for t = 0 
and 1 and £€ T E T _, = 0 for i > 1. Consequently, L is a first-order moving 
average under the null hypothesis of Ricardian equivalence. Equation (A23) 
then implies thatJ T _ t is correlated with I T _ i and hence with< T . Because J T - i 
and x, _ i are jointly determined, x T - 1 is generally correlated with € T as well. By 
construction, however, «’s dated after t - 2 are uncorrelated with y’s and x’s 
dated t - 2 or before. Therefore, under the null hypothesis, and x T - 2 
ate uncorrelated with l r . One can therefore estimate equation (A22) consis¬ 
tently by using $-2 and x ,_ 2 as instruments for t and x,_ |. Because the 
coefficient on x,_i in equation (A23) is zero under the null hypothesis of 
Ricardian equivalence and negative under Blanchard’s alternative, one can 
test the null hypothesis against this alternative by comparing the estimated 
coefficient on x ( _ 1 with a consistent estimate of its standard error. 

In the first case, 1, =* 0 and aggregation is exact. Equation (A23) therefore 
takes the form (16) in the text. In the second case, p" = 1,(1 - p")/(l - p) -* 
n, and 1 t — f„ the average ex post real interest rate between each date in 
quarter t - l and each date in quarter r. However, aggregation is approxi¬ 
mate because In d T is measured as In C, rather than as the average of In C/s 
and because A t _ i/C t _ 1 is used in lieu of A T _ i/C T _ 1 . To a first approximation, 
equation (A23) takes the form (17). It is hoped that the approximations in the 
second case introduce negligible errors. 

For expositional convenience, the distinction between the time indices t and
$\tau$ is not drawn in the text. The index t is used in Section II in the same way as it
is here but is used in Sections III-V in the same way as the index $\tau$ is used
here.
An Analysis of Congressional Voting on
Legislation Limiting Congressional Campaign
Expenditures


Bruce Bender 

University of Wisconsin—Milwaukee 


Congressional voting on proposed floor amendments concerned
solely with setting the level of the election campaign expenditure 
ceiling provision of the House Administration Committee's broad 
campaign finance reform bill of 1974 is analyzed for consistency with 
either the public-interest or economic theories of regulation. The 
benefit or cost to the individual congressman of a given ceiling is 
defined as the implied increase or decrease in his probability of 
reelection under the ceiling. Logit regression analysis provides the 
preponderant support for the economic theory of regulation by in¬ 
dicating that the likelihood of voting for a given ceiling varies di¬ 
rectly with the implied change in reelection probability under the 
ceiling and is quite sensitive to the implied change. 


I. Introduction 

The U.S. Congress passed the 1974 Federal Election Campaign Act 
(1974 FECA) in October of that year. This act, which was to become 
effective starting with the 1976 elections, contained regulations that 
substantially altered the campaign process. In brief, the 1974 FECA 
provided for partial public funding of presidential campaigns, ceil¬ 
ings on individual campaign contributions, and ceilings on campaign 
expenditures by candidates in presidential, Senate, and House elec¬
tions. Not surprisingly, the bill generated much congressional debate 
and several proposed floor amendments prior to its actual adoption.
The objective of this paper is to determine whether congressional
voting on those floor amendments concerned solely with setting the
level of the expenditure ceiling for House elections in the House
version of the 1974 FECA bill can best be explained by either the
public-interest or economic theories of regulation.

The congressional campaign expenditure ceiling is an almost
uniquely interesting piece of regulatory legislation because the
congressmen voting on the legislation are the ones directly and indirectly
affected by the legislation. By limiting expenditures the ceiling
changes probabilities that challengers will unseat incumbents, that is,
changes the barriers to entry into the "industry." Stigler's (1971)
economic theory of regulation, which presupposes that individuals seek
regulation that serves their private interests, implies that
congressional voting on the expenditure ceiling amendments should have
been dependent on the benefits received by or costs imposed on
individual congressmen, as measured by the changes in reelection
probabilities implied by the ceiling, rather than on congressmen's
perceptions of the public interest. It is precisely because the benefits and
costs of the expenditure ceiling are clearly defined and are directly
received or incurred by the individual congressmen that an analysis of
their voting on the ceiling amendments offers perhaps as clean and
direct a test of the public-interest and economic theories of regulation
as one could hope to find. 1


1 Typically in studies explaining the provisions of particular regulatory legislation or
congressional voting on regulatory legislation, the self-interest of each congressman is
taken to be a function of how such legislation affects the economic interests of his
constituents or a subset of his constituents. Measuring the impact on these economic
interests must be inferred indirectly from demographic and other data. Furthermore,
translating the impact on these economic interests into an optimal voting response by
the congressman can be a complicated process, particularly if the regulatory legislation
confers benefits on some subsets of constituents and costs on other subsets. This paper
largely avoids these problems and complications because the benefit or cost (the
increase or decrease in probability of reelection) of the regulatory legislation is clearly
defined, is amenable to straightforward measurement, and is directly received or
incurred by the individual congressmen. There has been previous research on certain
aspects of the 1974 FECA that has reached conclusions consistent with the economic
theory of regulation. Abrams and Settle (1978) found that public financing and
expenditure limits in presidential elections worked to the advantage of the Democratic
nominee and that congressional Democrats were significantly more likely to vote for the
1974 FECA and significantly more likely to vote against proposed amendments to
delete presidential public financing than Republicans were. On the basis of their
theoretical analysis, Aranson and Hinich (1979) concluded that the 1974 FECA's limitations
on individuals' campaign contributions benefited incumbents in general by increasing
the likelihood of their reelection. Finally, Jacobson (1976) and Kazman (1976)
concluded that the 1974 FECA spending limits tended to protect incumbents in general
because vote share regressions indicated that the expenditure necessary for challengers
to unseat incumbents exceeded the FECA ceiling. However, since the expenditure
ceiling is likely to help or hurt individual congressmen to different degrees, the anal-
Fortuitously, the U.S. Supreme Court provided a natural
experiment to perform such a test by ruling in January 1976 that the section
(608[c]) of the 1974 FECA setting expenditure limits for
congressional election campaigns was unconstitutional (Buckley v. Valeo [46
L Ed 2d 659]). Consequently, expenditures in the 1976 congressional
elections were unconstrained, thereby making possible estimates of
changes in the probability of reelection that would have occurred
under the campaign expenditure ceilings in the proposed
amendments. The estimated change in probability in each election can be
viewed as the direct benefit or cost to the incumbent of the
expenditure ceiling and therefore provides the basis for testing whether the
public-interest or economic theories of regulation can best explain
congressional voting on the proposed amendments to the House
Administration Committee's version of the 1974 FECA bill.

The paper begins with the estimation of a vote share regression for
a subset of 1976 U.S. House elections. However, an incumbent is
concerned with his probability of reelection, and vote share or even
expected vote share is not a measure of probability. The actual vote
share outcome of each election should be viewed as a random draw
from an assumed normal probability distribution of outcomes
generated by the vote share regression, where the mean and standard
deviation of the distribution are taken to be the fitted value of the vote
share and the standard error of the vote share regression,
respectively. This allows the calculation of an ex ante probability of each
incumbent's reelection. Furthermore, simulations of fitted values of
vote share under the various expenditure limits contained in
proposed amendments to the House Administration Committee's bill can
be used to estimate the probability of reelection that would have
occurred under each amendment. The increase or decrease in
probability of reelection under each proposed amendment can be interpreted
as the direct benefit or cost to an incumbent of passage of the
amendment. If congressional voting were based on narrow self-interest
rather than on public interest, then we would expect each
congressman to vote on each amendment according to his perceived benefit or
cost. Logit regressions for congressional voting on the proposed
amendments provide support for the economic theory of regulation
by indicating that the probability of voting for an amendment is
directly related to the implied algebraic value of the change in
probability of reelection under the amendment and also is quite sensitive to

yses of the impacts of expenditure limitations in congressional elections provide only
indirect support for the economic theory of regulation. An analysis of the voting on the
expenditure ceilings by the individual congressmen is necessary for a strong test of the 
economic theory of regulation. 
the magnitude of the implied change. 2 On the other hand, the nega¬
tive constant term of the regressions provides support for the public-
interest theory by indicating that a congressman is less likely to vote
for than against the expenditure ceiling even in the presence of a
small increase in reelection probability implied by the ceiling. How¬
ever, congressional self-interest rather than the public interest is the
more important determinant of congressional voting on the ceiling 
amendments since only a very small implied increase (both in absolute 
terms and relative to the reelection probability in the absence of the 
amendment’s ceiling) is sufficient to induce a congressman to be more 
likely to vote for than against the expenditure ceiling amendment. 


II. The Vote Share Regression Estimates 

The fraction of the vote received by a candidate can be viewed as a 
function of his stock of political capital relative to his opponent’s, 
where political capital is defined as favorable name recognition by the 
voters. Variables that contribute to a candidate’s political capital stock 
and that have virtually universally appeared in vote share regressions 
of previous researchers are measures of incumbency status, party 
strength, and campaign expenditure. All these variables are parame¬ 
ters except expenditure, which is a choice variable of the candidate. 

The specific vote share regression to be estimated for U.S. House of 
Representatives incumbent major party elections is 

$$\mathrm{IVOTE} = b_0 + b_1\,\mathrm{IPARTY} + b_2\,\mathrm{CX} + b_3\,\mathrm{RELCX} + b_4\,\mathrm{DI} + e, \tag{1}$$

where IVOTE is the fraction of the total votes cast for the Democratic 
and Republican candidates that is received by the incumbent, DI is a 
dummy variable for a Democratic incumbent, IPARTY is the fraction 
of registered Democratic and Republican voters in the congressional 
district that are registered in the incumbent’s party, CX is the challen¬ 
ger’s campaign expenditure in thousands of dollars, RELCX is the 
challenger’s campaign expenditure as a fraction of total campaign 

2 Of course it is possible to determine only whether or not empirical evidence is
consistent with the economic theory of regulation; it may be that self-interest and the
public interest coincide, although this is unlikely to hold for this analysis. Economists in 
general would conclude that the public interest would be best served by no ceiling on 
campaign expenditure just as they would conclude that the public interest would be 
best served by no ceiling on advertising for consumer products by individual firms. The 
actual passage of a ceiling on campaign expenditure seems to contradict a public- 
interest explanation of congressional voting. On a more sophisticated level, one could 
argue that the congressmen voted according to their self-interest but their self-interest 
was determined by their perception of their constituents' desire for an expenditure 
ceiling. While this argument cannot be ruled out a priori, it fails to explain why the 
individual congressman’s perception of constituents’ desires for a ceiling should be 
highly correlated with his or her change in probability of reelection under the ceiling. 
expenditure (i.e., RELCX = CX/[CX + IX], where IX is the incum¬
bent’s expenditure in thousands of dollars), 3 and e is the error term.
The inclusion of CX and RELCX is based on the premise that both
the level of spending by the challenger and relative spending by the
challenger influence vote share. 4 Finally, this specification allows for
the marginal product of challenger spending and the marginal prod¬
uct of incumbent spending in the production of vote share each to be
a function of the levels of both challenger and incumbent spending. 5

Regression equation (1) was estimated for a sample of U.S. House 
of Representatives elections involving incumbents in 1976. Elections 
were excluded that involved either an unopposed incumbent or third- 
party candidates receiving more than 5 percent of the total vote. Also, 
IPARTY could not be calculated for many elections because data on 
party registration are available for only 24 states and not for all con¬ 
gressional districts within those states. Since the objective of the paper 
is to explain congressional voting on the expenditure ceiling amend¬ 
ments in August 1974, those 28 incumbents who ran for reelection in 
November 1976 but were not in office as of August 1974 were elimi¬
nated from the sample. There were, therefore, 72 observations. Ordi¬
nary least squares estimates of (1) are presented below (t-statistics are
in parentheses):

$$\mathrm{IVOTE} = \underset{(20.30)}{.6250} + \underset{(4.15)}{.1915}\,\mathrm{IPARTY} - \underset{(5.36)}{.001386}\,\mathrm{CX} - \underset{(1.71)}{.08335}\,\mathrm{RELCX}, \qquad R^2 = .62. \tag{2}$$
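As a rough illustration of how the estimated equation can be used, the sketch below evaluates the fitted incumbent vote share from the reported coefficients. The function name and the example district (in particular the IPARTY value of 0.55) are hypothetical and are not taken from the paper; spending is set at the sample means reported in the text.

```python
# Coefficients of the estimated vote share regression (2), as reported in the text.
B0, B_IPARTY, B_CX, B_RELCX = 0.6250, 0.1915, -0.001386, -0.08335


def fitted_vote_share(iparty, cx, ix):
    """Fitted incumbent vote share; cx and ix are in thousands of dollars."""
    relcx = cx / (cx + ix)  # challenger's share of total campaign spending
    return B0 + B_IPARTY * iparty + B_CX * cx + B_RELCX * relcx


# Hypothetical district: IPARTY = 0.55 is an assumption made for illustration.
example = fitted_vote_share(iparty=0.55, cx=32.207, ix=72.607)
more_challenger_spending = fitted_vote_share(iparty=0.55, cx=60.0, ix=72.607)
print(round(example, 4), round(more_challenger_spending, 4))
```

Raising challenger spending lowers the fitted share through both the CX term and the RELCX term, which is the pattern the coefficient signs imply.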


3 The data sources for these variables are Barone, Ujifusa, and Matthews (1974,
1978). 

4 Incumbent expenditure is presumed in large part to be a reaction to challenger
expenditure. (See Jacobson [1985] for elaboration and supporting evidence.) The im¬ 
pact of incumbent expenditure in terms of reducing the effect of challenger expendi¬ 
ture is reflected by the appearance of IX in the denominator of the relative expenditure 
variable, RELCX. 

5 There is, however, a problem of simultaneity for an ordinary least squares regres¬ 
sion that has been pointed out by previous researchers. Vote share is a function of 
expenditure, but expenditure is likely to be a function of expected vote share. Unfortu¬
nately, the use of two-stage least squares estimation is not viable because there does not
appear to be an economically meaningful way of identifying a simultaneous system of 
vote share and expenditure equations. It is difficult to conceive of any exogenous 
variables that affect vote share but not expenditure and any exogenous variables that 
affect expenditure but not vote share. (Jacobson [1985] has independently arrived at 
the same conclusion on basically the same reasoning.) Furthermore, when researchers 
have attempted to estimate vote share and expenditure equations simultaneously, they 
have typically found little statistical difference between their two-stage least squares and 
ordinary least squares estimates. (See Palda [1975], Jacobson [1978, 1985], and Palda 
and Palda [1985]. Abrams and Settle [1978] report that their “attempt to control for
this potential simultaneity bias through two-stage estimation techniques was unsuccess¬
ful” [p. 254, n. 7].)
The estimated coefficients indicate that the incumbent's vote share
varies directly with the strength of the incumbent’s party in the dis¬ 
trict and inversely with the level of the challenger’s spending and the 
challenger’s spending as a fraction of total spending as expected. 
Since the dependent variable is the incumbent’s vote share, the con¬ 
stant term can be attributed to incumbency (as well as possible omitted 
variables). All fitted values of IVOTE were in the interval [.456, .799]. 
The dummy variable for a Democratic incumbent, DI, was dropped 
from the regression because its estimated coefficient was insignificant 
(t-statistic of 0.19), and its inclusion left the coefficients of the other
variables virtually unchanged. The insignificance of the coefficient of 
DI indicates that the mood of the national electorate in 1976 favored 
neither party. 

Calculated marginal products of expenditure in the production of 
the incumbent’s vote share based on the estimated coefficients of (2) 
accord nicely with expectations. Not only is the marginal product of 
incumbent expenditure (MP_IX) positive and the marginal product of
challenger expenditure (MP_CX) negative, but both expenditures ex¬
hibit diminishing marginal productivity as MP_IX is decreasing in IX
and MP_CX is increasing (algebraic value) in CX. 6 Evaluated at the
mean values of expenditure (IX = 72.607 and CX = 32.207 in
thousands of dollars), MP_IX = 2.44 × 10^{-4} and MP_CX = -1.93 ×
10^{-3}. An increase of $1,000 of incumbent expenditure increased the
incumbent’s vote share by .000244, whereas an increase of $1,000 of 
challenger expenditure decreased the incumbent’s vote share by 
.00193. At the mean levels of expenditure, the absolute value of the 
marginal product of challenger expenditure was 7.9 times as large as 
the marginal product of incumbent expenditure. When evaluated at 
levels of incumbent and challenger expenditure equal to the mean 
value of incumbent expenditure, MP_IX = 2.87 × 10^{-4} and MP_CX =
-1.67 × 10^{-3}, implying |MP_CX| = 5.8 MP_IX.
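The marginal products quoted above can be checked numerically. The sketch below is an illustration, assuming the derivative expressions that follow from the RELCX = CX/(CX + IX) specification and the reported coefficients on CX and RELCX; the function names are ours.

```python
# Coefficients on CX and RELCX from the estimated vote share regression.
B_CX, B_RELCX = -0.001386, -0.08335


def mp_incumbent(cx, ix):
    """d(IVOTE)/d(IX) = -b3 * CX / (CX + IX)^2."""
    return -B_RELCX * cx / (cx + ix) ** 2


def mp_challenger(cx, ix):
    """d(IVOTE)/d(CX) = b2 + b3 * IX / (CX + IX)^2."""
    return B_CX + B_RELCX * ix / (cx + ix) ** 2


mean_cx, mean_ix = 32.207, 72.607  # sample means, thousands of dollars

# Evaluated at the mean values of both expenditures.
at_means = (mp_incumbent(mean_cx, mean_ix), mp_challenger(mean_cx, mean_ix))
# Evaluated with both candidates spending the incumbent mean.
at_equal = (mp_incumbent(mean_ix, mean_ix), mp_challenger(mean_ix, mean_ix))

ratio_at_means = abs(at_means[1]) / at_means[0]
ratio_at_equal = abs(at_equal[1]) / at_equal[0]
print(at_means, ratio_at_means)
print(at_equal, ratio_at_equal)
```

The two ratios correspond to the statement that challenger spending is several times as productive at the margin as incumbent spending over this range.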

The relative marginal products calculated above indicate that mar¬ 
ginal spending by the challenger has considerably more impact on 
vote share than marginal spending by the incumbent over the rele¬ 
vant range of spending. An interesting corollary question is what the 
minimum amount is that the “average” challenger would have to 
spend to defeat the "average” incumbent. On the basis of the esti¬ 
mated coefficients of (2), that amount for 1976 would have been 
$93,810, which is 2.9 and 1.3 times the mean expenditures by the 

6 MP_IX = (∂IVOTE/∂RELCX)(∂RELCX/∂IX) = -b_3[CX(CX + IX)^{-2}]; MP_CX =
(∂IVOTE/∂CX) + (∂IVOTE/∂RELCX)(∂RELCX/∂CX) = b_2 + b_3[(CX + IX)^{-1} -
CX(CX + IX)^{-2}]; (∂MP_IX/∂IX) = 2b_3[CX(CX + IX)^{-3}]; and (∂MP_CX/∂CX) =
2b_3[CX(CX + IX)^{-3} - (CX + IX)^{-2}]. A sufficient condition for MP_IX to be positive
and decreasing and for MP_CX to be negative and increasing is b_2 < 0 and b_3 < 0.
challenger and incumbent, respectively. In brief, the calculations
based on (2) indicate that challenger expenditure has substantially 
greater marginal productivity than incumbent expenditure and that 
challengers typically still must spend large amounts of money if they 
hope to unseat incumbents. With this in mind, the rest of the paper is 
devoted to an analysis of congressional voting on legislation providing 
ceilings on campaign expenditure. 

III. The 1974 Federal Election Campaign Act 

On October 10, 1974, the U.S. House adopted the Senate-House 
conference report (CQ437) 7 on the 1974 Federal Election Campaign 
Act. The bill provided for partial public financing of presidential 
elections and ceilings on campaign contributions by individuals and 
campaign expenditures by candidates for federal office starting in 
1976. The bill set the expenditure ceiling for House candidates at 
$84,000 (indexed for inflation after 1976) for the general election and 
for the primary election, if any. 8 Previously, on August 8, 1974, the 
House passed the House version (CQ336) of the 1974 FECA, which 
was similar to CQ437 except that the expenditure ceiling for House 
candidates was $75,000 (indexed for inflation after 1976). 

There was much public sentiment for campaign financing reform 
at the time since the Watergate scandal was still fresh. Both measures 
passed the House easily: CQ437 was adopted by a vote of 365-24 
(220-6 by Democrats and 145-18 by Republicans), and CQ336 was 
adopted by 355-48 (219-3 by Democrats and 136-45 by Republi¬ 
cans). 

Although there is evidence that expenditure ceilings in congres¬ 
sional elections generally favor incumbents (Jacobson 1976; Kazman 
1976) and that the partial public financing and expenditure ceilings in 
presidential elections tend to favor Democrats (Abrams and Settle
1978), it is difficult to analyze voting by House members on their own 
campaign expenditure ceilings on the basis of their voting on CQ437 
and CQ336 for two reasons. First, the ceiling on campaign expendi¬
ture was only one of many provisions in CQ437 and CQ336. Second,
congressional passage of the conference report and the final version 
of the House bill was a virtual certainty, and in the light of the public 
mood at the time, it is likely that House members who opposed both 
measures were reluctant to make the futile gesture of going on record 


7 For convenience all bills, amendments, and motions will be referred to by the 
numbers assigned to them by the Congressional Quarterly Weekly Report. 

8 The $84,000 limit is equal to a ceiling of $70,000 to be spent for the campaign and
an additional 20 percent ($14,000) to be spent on fund-raising. 
as opposing them. In contrast, it is likely that House members would
vote according to their true desires on the virtually unpublicized
amendments dealing exclusively with expenditure ceiling levels that
were presented on the House floor prior to passage of CQ336. 

On August 7, 1974, the House Administration Committee’s cam¬
paign financing reform bill (HR 16090) was brought to the floor of
the House for consideration. This bill contained measures similar to 
those of CQ336 but with a campaign expenditure ceiling of $93,750. 
The manner in which the bill was brought to the floor is particularly 
interesting. Wayne Hays (D-Ohio), chairman of the House Adminis¬ 
tration Committee, obtained a ruling from the House Rules Commit¬ 
tee limiting the introduction of floor amendments to the bill to “only 
amendments dealing with public financing [of elections], a proposed 
political financing supervisory board, and campaign contribution and 
spending limits. The rule prevented Republicans from offering 
amendments banning campaign ‘dirty tricks’ and preventing groups, 
primarily labor unions, from pooling members’ contributions and 
then concentrating the sums on selected candidates and election cam¬
paigns” (Congressional Quarterly Weekly Report, August 10, 1974, p.
2192). A motion (CQ324) to end debate and the possibility of amend¬ 
ing this rule passed by a vote of 219-190 but almost totally along 
party lines (218-9 by Democrats and 1-181 by Republicans). The 
partisan voting on CQ324 is consistent with the economic theory of 
regulation. Since labor unions generally support Democratic candi¬ 
dates and the bill effectively allowed labor unions to circumvent the 
bill’s other provisions limiting an individual’s contributions and an 
organization’s contributions to $1,000 and $5,000 per election, re¬ 
spectively, the rule limiting the introduction of floor amendments to 
the bill clearly hurt Republicans. In the light of the voting on CQ324,
the actual motion (CQ325) to bring the Administration
Committee’s bill (HR 16090) to the House floor under the restrictions
of the Rules Committee’s ruling passed easily, 330-78 (223-3 by
Democrats and 107-75 by Republicans). For notational consistency,
the bill, HR 16090, brought to the floor will be referred to as CQ325. 

Later that day two amendments were introduced with the sole pur¬ 
pose of lowering the expenditure ceiling. Amendment CQ326, which 
proposed a ceiling of $53,125, was defeated 187-223 (103-125 by 
Democrats and 84-98 by Republicans), whereas a subsequent amend¬ 
ment, CQ329, which proposed a ceiling of $75,000, was passed 240- 
175 (130-102 by Democrats and 110-73 by Republicans). These 
votes on CQ326 and CQ329, which are concerned solely with the level 
of the expenditure ceiling and which seem to reflect little, if any, 
partisanship, are likely to reflect the individual House members’ de¬
sired level of the expenditure ceiling. It is these votes that are the
objects of analysis of the next section. 


IV. An Analysis of Congressional Voting on
Campaign Expenditure Ceilings 


The economic theory of regulation implies that each individual mem¬ 
ber of the House would tend to vote on CQ326 and CQ329 according 
to his self-interest. It is assumed here that this self-interest can be 
measured in terms of the change in the probability of reelection in 
1976 implied by the change in the level of the campaign expenditure 
ceiling proposed in each of these two amendments. 9 An increase (de¬ 
crease) in this probability should elicit a vote of yes (no) on each 
amendment. 

Probability of reelection can be inferred in the following manner. 
Consider the incumbent’s actual vote share outcome in 1976 to be a 
random draw from an assumed normal probability distribution of 
outcomes generated by the vote share regression equation (1). Let this 
normal probability distribution for each incumbent be described by 
an expected value equal to the fitted value (FVOTE) for that incum¬ 
bent and by a standard deviation equal to the standard error of the 
regression (S). The probability of reelection for the ith incumbent can 
now be inferred from 


$$Z_i = \frac{\mathrm{FVOTE}_i - .5}{S}, \tag{3}$$


where Z_i is the number of standard deviations that the ith incumbent’s
expected vote share lies from the minimum vote share of .5 necessary 
to win reelection. 10 Given the estimated coefficients of the vote share 
regression, fitted values of IVOTE, and therefore probabilities of 
reelection, can be simulated for the different levels of the expenditure


9 It is possible that congressmen will consider the perceived change in the probability
of reelection not only in 1976 but in later elections as well. It is likely, however, that
these changes in probabilities for elections beyond 1976 are positively correlated with
the change in 1976, particularly if the congressman perceives a trend toward improving
or deteriorating reelection prospects in his district. Furthermore, in the absence of such
a trend, it is likely to require considerably more foresight in 1974 to assess possible
electoral success in 1978 than in 1976.

10 Obviously the truncation of the normal distribution at vote share values of zero
and one makes the calculation of reelection probability based on Z_i only approximate.
However, given that the range of fitted values that will be used is [.456, .799] and that
the standard error that will be used is only .065, the tails will be extremely thin at zero
and one. The truncation of the tails at zero and one should therefore impart negligible
bias to the calculations of probabilities based on the values of Z_i.

ceiling (applicable to expenditure by both the incumbent and the
challenger) under each amendment considered by the House.
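The step from a fitted vote share to a reelection probability can be sketched as follows. This is a minimal illustration, assuming the normal distribution described in the text and the standard error of .065 mentioned in the footnote on truncation; the function names are ours, and the standard normal CDF is built from `math.erf`.

```python
import math

S = 0.065  # standard error of the vote share regression, as reported in the source


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def reelection_probability(fitted_vote, s=S):
    """Probability that a normal draw centered on fitted_vote exceeds .5."""
    z = (fitted_vote - 0.5) / s  # standard deviations above the .5 threshold
    return normal_cdf(z)


# The endpoints of the reported range of fitted values, plus the .5 threshold itself.
for fitted in (0.456, 0.5, 0.799):
    print(fitted, round(reelection_probability(fitted), 3))
```

An incumbent with a fitted share exactly at .5 has an even chance of reelection, and the probability rises quickly as the fitted share moves a few standard errors above that threshold.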

Values of Z_i and the corresponding probabilities were calculated
from (2) for each of the 72 incumbents for 1976 11 under the following 
circumstances: a ceiling of $93,750 (CQ325), a ceiling of $53,125 
(CQ326), and a ceiling of $75,000 (CQ329). A pure and direct test of 
whether congressional voting on campaign expenditure ceilings was 
motivated by self-interest is to run logit regressions of the votes on 
CQ326 and CQ329 on the respective changes in the probability of
reelection resulting from the lowering of the ceiling from the $93,750 
(CQ325) level in the House Administration Committee’s bill. 

It does not make sense, however, to run these logit regressions for 
the full sample of the 72 incumbent elections because in only 44 of the 
elections was the ceiling of $53,125 (CQ326) binding on at least one 
candidate and in only 29 of the elections was the ceiling of $75,000 
(CQ329) binding on at least one candidate. Assuming reasonably 
good foresight on the part of the incumbents, only those incumbents 
in elections for which the ceiling would be binding would actually 
expect to receive any benefit or incur any cost from the imposition of 
the ceiling. 12 The other incumbents would be relatively more in¬
fluenced by considerations such as the impact of the ceiling on the
long-term electoral success of their party in general. Such considera¬
tions are more difficult to measure, and the logit regressions are
therefore limited to those elections for which the ceiling was binding. 
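The simulation described here can be sketched in a few lines. This is only an illustration under stated assumptions: spending above a ceiling is simply truncated to the ceiling, the coefficients and standard error are those reported in the text, and the three example districts (IPARTY, CX, IX values) are hypothetical rather than drawn from the paper's sample.

```python
import math

B0, B_IPARTY, B_CX, B_RELCX = 0.6250, 0.1915, -0.001386, -0.08335
S = 0.065  # standard error of the vote share regression


def reelection_probability(iparty, cx, ix, ceiling):
    """Reelection probability with both candidates' spending capped at the ceiling."""
    cx_c, ix_c = min(cx, ceiling), min(ix, ceiling)
    fitted = B0 + B_IPARTY * iparty + B_CX * cx_c + B_RELCX * cx_c / (cx_c + ix_c)
    z = (fitted - 0.5) / S
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def is_binding(cx, ix, ceiling):
    """True if the ceiling constrains at least one candidate."""
    return max(cx, ix) > ceiling


def change_in_probability(iparty, cx, ix, new_ceiling, base_ceiling=93.75):
    """Change in reelection probability from lowering the ceiling."""
    return (reelection_probability(iparty, cx, ix, new_ceiling)
            - reelection_probability(iparty, cx, ix, base_ceiling))


# Hypothetical districts (thousands of dollars); none is taken from the data.
challenger_capped = change_in_probability(0.45, cx=110.0, ix=60.0, new_ceiling=75.0)
incumbent_capped = change_in_probability(0.45, cx=30.0, ix=120.0, new_ceiling=75.0)
nobody_capped = change_in_probability(0.45, cx=30.0, ix=50.0, new_ceiling=75.0)
print(challenger_capped, incumbent_capped, nobody_capped)
```

When the ceiling binds only on the challenger the incumbent gains, when it binds only on the incumbent the incumbent loses, and when it binds on neither candidate the change is zero, which is why the nonbinding elections carry no information for the logit regressions.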

For reasons that will become clear later, voting on CQ329 will be
analyzed first. Define V329 as a qualitative variable taking the value 


11 Use of the 1976 election vote share regression (2) implicitly assumes that incum¬
bents can reasonably accurately forecast the estimated regression coefficients for the
1976 elections. This need not be a particularly strong assumption since incumbents are 
the successful “firms" in the political "industry” and part of their success can likely be 
attributable to their ability to make such forecasts. In any event, use of an estimated 
vote share regression for the Watergate election year 1974 would be inappropriate 
because it is unlikely that incumbents would expect the estimated regression for such an 
unusual election year to be valid in 1976, the first election year to be governed by the 
1974 FECA. 

12 An incumbent can be hurt (i.e., his expected vote share can be reduced) by the
expenditure ceiling under two circumstances: (1) the ceiling is binding on the incum¬
bent's expenditure but not on the challenger’s, and (2) the ceiling is binding on both
candidates’ expenditures but the incumbent's unconstrained expenditure exceeds the
challenger's unconstrained expenditure by an amount large enough to more than
offset the higher marginal product (in the production of vote share) of the challenger’s
expenditure. Similarly, an incumbent can be helped by the expenditure ceiling under
three circumstances: (1) the ceiling is binding on the challenger’s expenditure but not
on the incumbent’s, (2) the ceiling is binding on both candidates’ expenditures and the
challenger’s unconstrained expenditure exceeds the incumbent’s, and (3) the ceiling is
binding on both candidates’ expenditure and the incumbent's unconstrained expendi¬
ture exceeds the challenger's but not by an amount large enough to offset the higher
marginal product of the challenger's expenditure.
one if the congressman voted yes on CQ329 and zero if he voted no, 13
and define P329 - P325 as the change in probability of reelection
implied by a lowering of the expenditure ceiling from $93,750 under
CQ325 to $75,000 under CQ329. Logit regressions of V329 on
P329 - P325 and DI are

$$\mathrm{V329} = \underset{(1.16)}{-.54} + \underset{\substack{(2.24)\\ [6.49]}}{25.99}\,(\mathrm{P329} - \mathrm{P325}), \qquad \chi^2(1) = 6.80, \tag{4}$$

and

$$\mathrm{V329} = \underset{(1.40)}{-.81} + \underset{\substack{(2.08)\\ [6.06]}}{24.25}\,(\mathrm{P329} - \mathrm{P325}) + \underset{\substack{(.84)\\ [.18]}}{.73}\,\mathrm{DI}, \qquad \chi^2(2) = 7.51, \tag{5}$$

where t-statistics are in parentheses and slope coefficients evaluated at
the mean value of the dependent variable are in brackets.

Regressions (4) and (5) are consistent with the self-interest hy¬
pothesis of congressional voting on the expenditure ceiling and pro¬
vide a relatively high degree of predictive ability. 14 The significantly
positive coefficients of P329-P325 in (4) and (5) indicate that an 
incumbent’s likelihood of voting for the CQ329 ceiling varies directly 
with the change in his probability of reelection. Furthermore, the 
voting on CQ329 is quite sensitive to the magnitude of the implied 
change in probability of reelection. The calculated slope coefficients 
of 6.49 and 6.06 indicate that an increase (decrease) in reelection 
probability of one percentage point induces an increase (decrease) in 
the probability of voting for CQ329 of six percentage points. This is 
consistent with the economic theory of regulation in that the voting by 
the individual congressmen on an amendment regulating their own 
activity follows their self-interest. 
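The bracketed slope coefficients can be reproduced from the logit coefficients. The sketch below assumes the standard logit result that the derivative of the probability with respect to a regressor equals the coefficient times p(1 - p), and it infers a mean of the dependent variable of 15/29 from the footnote reporting that the rule "all vote yes" predicts 15 of the 29 votes correctly. Both the inference and the function names are ours.

```python
import math


def logit_probability(a0, a1, x):
    """Probability of a yes vote implied by a logit with constant a0 and slope a1."""
    return 1.0 / (1.0 + math.exp(-(a0 + a1 * x)))


def slope_at_probability(a1, p):
    """Derivative of the logit probability with respect to x, evaluated at p."""
    return a1 * p * (1.0 - p)


mean_yes = 15 / 29  # share of yes votes on CQ329 in the subsample (inferred)

slope_4 = slope_at_probability(25.99, mean_yes)  # regression (4)
slope_5 = slope_at_probability(24.25, mean_yes)  # regression (5)
print(round(slope_4, 2), round(slope_5, 2))

# A one-percentage-point gain in reelection probability, starting from no change.
shift = logit_probability(-0.54, 25.99, 0.01) - logit_probability(-0.54, 25.99, 0.0)
print(round(shift, 3))
```

Because the slope is evaluated at a single point, the six-percentage-point response is a local approximation and varies with the starting probability.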

The insignificant coefficient of DI indicates that party affiliation
had no significant effect on the voting. 15 This stands in contrast to the 
findings of Abrams and Settle (1978) that party affiliation did in¬ 
fluence voting on the overall bill and on amendments concerned with 
public financing of presidential elections. However, the contrast actu¬ 
ally serves to buttress the support for the self-interest explanation of 
congressional voting in their paper and this paper. The benefits or 

13 The data source for voting by individual congressmen on CQ329 is the Congres¬ 
sional Quarterly Weekly Report (August 17, 1974). Data for the construction of the vari¬ 
able V326, for voting by individual congressmen on CQ326, is from the August 10, 
1974, report. 

14 Both regressions correctly predicted 21 of the 29 votes as opposed to 15 of 29 
correctly predicted under the naive prediction rule of “all vote yes.” 

15 The insignificant coefficient of DI is unlikely to have been caused by multicol¬
linearity since the correlation coefficient of P329 - P325 and DI is only .23.
costs of the amendments analyzed by Abrams and Settle are party-
specific, whereas the benefits or costs of the expenditure ceiling
amendments are congressman-specific. That party affiliation in¬ 
fluenced voting on the former set of amendments but not on the 
latter set is consistent with the self-interest hypothesis. 

The negative constant term does, however, provide some support 
for a public-interest explanation of congressional voting. It indicates 
that a congressman whose probability of reelection was unchanged or 
even slightly increased by the expenditure ceiling would be less likely 
to vote for than against CQ329. Since economists would generally 
agree that limitations on advertising, whether for goods or for candi¬ 
dates, are unlikely to serve the public interest, a negative constant 
term can be interpreted as reflecting the influence of the public inter¬ 
est on congressional voting. 

The interesting question is to determine the relative magnitudes of 
the influences of public interest and self-interest on the voting. The 
logit regression offers some insight for answering this question. 
Specifically, calculations based on (4) indicate that a congressman is 
more likely to vote for the ceiling in CQ329 (i.e., Pr[V329_i = 1] > .5
for the ith congressman) if the expenditure ceiling implies an increase
in his probability of reelection that exceeds only two percentage
points. 16 Given that the mean value of the probability of reelection in
the absence of the CQ329 ceiling (i.e., under the CQ325 ceiling) is .90, 
regression (4) suggests that only a relatively small private benefit 
under CQ329 is sufficient to make a congressman more likely to vote 
against than for the public interest. 
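The two-percentage-point threshold follows directly from the logit coefficients: the predicted probability of a yes vote crosses .5 where the linear index equals zero. A minimal sketch, using the constant and slope reported for regression (4) and hypothetical function names:

```python
import math


def yes_probability(a0, a1, x):
    """Logit probability of a yes vote at a change in reelection probability x."""
    return 1.0 / (1.0 + math.exp(-(a0 + a1 * x)))


def threshold(a0, a1):
    """Value of x at which the probability of a yes vote equals .5."""
    return -a0 / a1


a0, a1 = -0.54, 25.99  # constant and slope of regression (4)
t4 = threshold(a0, a1)
print(round(t4, 4))
print(round(yes_probability(a0, a1, 0.0), 3))  # no change in reelection probability
print(round(yes_probability(a0, a1, t4), 3))   # at the threshold
```

With no change in reelection probability the predicted probability of a yes vote is below one-half, which is the sense in which the negative constant term works against the ceiling.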

Similar logit regressions were run for congressional voting on 
CQ326. The variables are defined analogously to those of regressions 
(4) and (5): V326 is the qualitative variable for voting on CQ326, and 
P326 - P325 is the change in probability of reelection implied by low¬ 
ering the expenditure ceiling from $93,750 under CQ325 to $53,125 
under CQ326. The logit regressions are 

$$\mathrm{V326} = \underset{(2.13)}{-.77} + \underset{\substack{(1.30)\\ [1.32]}}{5.69}\,(\mathrm{P326} - \mathrm{P325}), \qquad \chi^2(1) = 1.73, \tag{6}$$

16 It can be easily shown that logit regression (4) is equivalent to

$$\ln\left[\frac{\Pr(\mathrm{V329}_i = 1)}{1 - \Pr(\mathrm{V329}_i = 1)}\right] = a_0 + a_1\,(\mathrm{P329} - \mathrm{P325})_i$$

for the ith congressman. A congressman will be more likely to vote for CQ329 (i.e.,
Pr[V329_i = 1] > .5) if (P329 - P325)_i > -a_0/a_1. For (4) this threshold value of P329 -
P325 is .02. It should also be noted that .02 may be an overestimate of the threshold
value because the negative constant term of (4) is not significantly different from zero at
the .10 level on a one-tailed test. Regression (4) is more appropriate than (5) because
the coefficient of DI in (5) is quite insignificant.
$$\mathrm{V326} = \underset{(1.86)}{-.86} + \underset{\substack{(1.26)\\ [1.28]}}{5.55}\,(\mathrm{P326} - \mathrm{P325}) + \underset{\substack{(.33)\\ [.05]}}{.21}\,\mathrm{DI}, \qquad \chi^2(2) = 1.84. \tag{7}$$

Regressions (6) and (7) do not perform as strongly as (4) and (5). 
Although the signs of the coefficients of P326 - P325 are positive as 
expected, they are barely significant at the .10 level on a one-tailed 
test. Furthermore, the χ² statistic does not allow the rejection of the
null hypothesis of zero values of all coefficients (except the constant 
term) at the .10 level. Again, party affiliation has no impact on the 
voting. 
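The significance statements about the χ² statistics can be checked without statistical tables, because the chi-square distribution has simple closed-form upper-tail probabilities for one and two degrees of freedom. The sketch below is an illustration only; it takes the statistics for (6) and (7) to be 1.73 and 1.84 and compares them with the 6.80 reported for (4). The function name is ours.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or 2."""
    if df == 1:
        # A chi-square(1) variable is a squared standard normal.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # A chi-square(2) variable is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("closed forms used here cover only df = 1 or 2")


p_values = {
    "(4)": chi2_upper_tail(6.80, 1),
    "(6)": chi2_upper_tail(1.73, 1),
    "(7)": chi2_upper_tail(1.84, 2),
}
for label, p in p_values.items():
    verdict = "reject at .10" if p < 0.10 else "cannot reject at .10"
    print(label, round(p, 3), verdict)
```

For other degrees of freedom a library routine would be needed; the closed forms are used here only to keep the example free of dependencies.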

In analyses of the voting on CQ326, the sequence of voting on the 
amendments should be kept in mind. First, CQ326, a proposed 
amendment to lower the campaign expenditure ceiling in the com¬ 
mittee bill from $93,750 to $53,125, was defeated. Later that same 
day, CQ329, a proposed amendment to lower the ceiling from 
$93,750 to $75,000, was passed. Since CQ329 passed relatively easily
(240-175), it is likely that the voting on CQ326 occurred with the 
expectation that CQ329 would be passed in the event that CQ326 was 
defeated. 17 Therefore, it may be more appropriate to regress V326, 
the voting on CQ326, on P326-P329, the change in probability of 
reelection implied by the lowering of the expenditure ceiling from 
$75,000 under CQ329 to $53,125 under CQ326, rather than on 
P326 — P325 as was done in regressions (6) and (7). These logit regres¬ 
sions are 

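$$\mathrm{V326} = \underset{(2.24)}{-.80} + \underset{\substack{(1.57)\\ [3.36]}}{14.50}\,(\mathrm{P326} - \mathrm{P329}), \qquad \chi^2(1) = 2.62, \tag{8}$$

and

$$\mathrm{V326} = \underset{(1.93)}{-.91} + \underset{\substack{(1.55)\\ [3.32]}}{14.33}\,(\mathrm{P326} - \mathrm{P329}) + \underset{\substack{(.36)\\ [.05]}}{.23}\,\mathrm{DI}, \qquad \chi^2(2) = 2.75. \tag{9}$$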


17 When one is analyzing voting on amendments, there is always some uncertainty
regarding the voters' expectations about proposals of subsequent amendments and
their chances of being adopted. Enelow and Hinich (1983) have shown that when a
voter is faced with a sequence of votes on related issues, an expected-utility maximizer
will condition his vote on any one issue on the actual outcome of the earlier votes and
his forecast of the outcomes of the subsequent votes. Specifically, they show that the
forecasts used will be his estimates of the means of the subsequent votes. Given that
CQ326 and CQ329 were sequentially considered on the same day and given that
CQ329 passed easily, it is reasonable to assume that voting on CQ326 was conditioned
on the expectation that CQ329 would be passed if CQ326 was defeated.
The estimates of regressions (8) and (9) are stronger than those of
(6) and (7). The coefficients of P326 - P329 are positive, as expected,
and significant at the .06 level on a one-tailed test. Corresponding
slope coefficients indicate that a one-percentage-point increase (de¬
crease) in the probability of reelection implied by having the expendi¬
ture ceiling in CQ326 rather than the one in CQ329 caused over a
three-percentage-point increase (decrease) in a congressman’s proba¬
bility of voting for CQ326. For yet another time, the influence of
party affiliation on the vote was insignificant. 18 However, the χ² statis¬
tic was just barely short of significance at the .10 level in regression
(8).

As in the case of the logit regressions for voting on CQ329, the 
negative constant term can be interpreted as reflecting the influence 
of public interest on the voting while the positive coefficient of 
P326 - P329 can be interpreted as reflecting the influence of self-
interest. Calculations based on (8) indicate that a congressman is more
likely to vote for CQ326 (i.e., Pr[V326_i = 1] > .5 for the ith congress¬
man) if the expenditure ceiling implies an increase in his probability
of reelection that exceeds five percentage points. In the light of a 
mean value of reelection probability in the absence of the CQ326 
ceiling (i.e., under the CQ329 ceiling) of .95, the threshold value of 
.05 is relatively small. 

The voting on CQ326 does, however, appear to reflect a relatively 
smaller influence of private interest and a relatively larger influence 
of public interest than the voting on CQ329. This is indicated by the 
significantly negative constant term of (8) exceeding (absolute value) 
the insignificantly negative constant term of (4) and by the smaller 
and less significant positive coefficient of the change in reelection 
probability in (8) than in (4). Both of these factors caused the 
threshold value of .05 for P326 - P329 to exceed the threshold value 
of .02 for P329-P325. 

A possible explanation for the smaller influence of private interest 
on the CQ326 voting than on the CQ329 voting hinges on the fact 
that the mean value of reelection probability in the absence of the 
CQ326 ceiling (i.e., under the CQ329 ceiling) of the incumbents voting 
on CQ326 is .95, whereas the mean value of reelection probability 
in the absence of the CQ329 ceiling (i.e., under the CQ325 ceiling) of 
the incumbents voting on CQ329 is .90. If a one-percentage-point 
change in probability is less valuable to an incumbent 
when his initial probability of reelection is higher, then an 
incumbent’s vote on an expenditure ceiling will tend to be less 

18 Again the insignificant coefficient of D1 is unlikely to have been caused by multicollinearity 
since the correlation coefficient of P326 - P329 and D1 is only .08. 
influenced by a one-percentage-point change in his probability of 
reelection when his initial probability of reelection is higher. 19 That 
private interest should have a smaller influence and public interest 
should have a larger influence on the CQ326 voting than on the 
CQ329 voting is therefore not surprising. The differences in the estimated 
CQ326 and CQ329 regressions merely reflect differences in 
the values that the two subsamples of incumbents place on a given 
ceiling-induced change in reelection probability. 

The data do, however, have a shortcoming, and a determination of 
the severity of this shortcoming required further testing. The expen¬ 
diture ceilings apply to primary and general election spending indi¬ 
vidually, but the available expenditure data are for total spending and 
are not broken down by primary and general election spending. Only 
general election expenditure is appropriate for this purpose. In order to 
determine whether the use of total expenditure distorted the logit 
regression estimates, those elections for which the incumbent faced 
primary election opposition were eliminated from the relevant sub¬ 
samples of observations. This was done because of the possibility that 
incumbents with primary competition faced serious challenges that 
could have induced large primary campaign expenditures. 20 The 
V329 logit regressions (4) and (5) were rerun on the reduced subsample 
of 20 observations, and the V326 logit regressions (6)-(9) were 
rerun on the reduced subsample of 33 observations. These regression 
estimates were changed little from those presented in the text, 21 indi¬ 
cating that potential problems with regressions (4)-(9) resulting from 
the use of total expenditure data did not materialize. Consequently, 
the conclusions regarding the relative importance of self-interest and 
public interest in the determination of the voting on the expenditure 
ceilings are unchanged. 

19 Furthermore, the CQ326 ceiling had less of an impact on reelection probability 
than the CQ329 ceiling. The mean and standard deviation of the change in reelection 
probability under CQ326 were 0.015 and 0.035, respectively, whereas the mean and 
standard deviation under CQ329 were 0.028 and 0.045, respectively. Consequently, 
not only might a given change in reelection probability be valued less highly by the 
subsample of congressmen voting on CQ326 than by the subsample voting on CQ329, 
but also the mean change in probability was lower for the subsample voting on CQ326 
than the subsample voting on CQ329. Both factors may tend to cause private interest to 
have a relatively smaller estimated impact on the CQ326 voting than on the CQ329 
voting. 

20 In general, however, this does not appear to have happened. Of the 12 general 
elections for which the incumbent faced primary opposition, incumbents won all the 
primaries with a mean vote share of .72. 

21 In fact, the t-statistics for the coefficients of P329 - P325 in the V329 regressions 
were slightly reduced, but the coefficients were still significant at the .05 level on a one- 
tailed test; whereas the t-statistics for the coefficients of P326 - P325 and P326 - P329 in 
the V326 regressions were increased so that the coefficients were now significant at the 
.05 level on a one-tailed test. The actual regressions are available from the author on 
request. 
V. Summary and Conclusions 

The objective of this paper has been to determine whether voting by 
members of the U.S. House of Representatives on legislation limiting 
campaign expenditures can be best explained by the public-interest or 
economic theories of regulation. Specifically, voting on proposed 
amendments to the House version of the 1974 Federal Election Cam¬ 
paign Act bill prior to its passage was analyzed. Since the benefits and 
costs to the congressmen are so direct and unambiguous, this analysis 
represents a strong test of whether the public-interest or economic 
theories of regulation can best explain congressional voting on regu¬ 
latory bills. 

The benefit (cost) to the individual House member is defined as the 
increase (decrease) in probability of reelection implied by the expen¬ 
diture ceiling. These changes in reelection probabilities were cal¬ 
culated from a vote share regression equation estimated from a sam¬ 
ple of U.S. House incumbent elections in 1976. 

Logit regressions of voting by congressmen on proposed amend¬ 
ments (CQ326 and CQ329) concerned solely with changing the level 
of the expenditure ceiling were run on the amendment's implied 
change in the probability of the incumbent's reelection and the 
incumbent’s party affiliation. While the regressions provided evidence 
consistent with both the economic and public-interest theories of reg¬ 
ulation, congressional self-interest dominated the public interest in 
the determination of the voting. The consistently positive estimated 
relationship between the implied change in probability of reelection 
and the probability of the incumbent’s voting for the amendment 
supports the hypothesis that the voting was driven by individual con¬ 
gressmen’s self-interest. On the other hand, the negative constant 
term can be interpreted as evidence supporting the public-interest 
theory of regulation. However, only a quite small implied increase in 
reelection probability (both in absolute terms and relative to the 
reelection probability in the absence of the amendment’s expenditure 
ceiling) was sufficient to induce the incumbent to be more likely to 
vote for than against the amendment. Finally, the party affiliation of 
the incumbent had no significant effect on his vote. This suggests that 
party concerns were of little importance in determining congressional 
voting when the benefits and costs of the amendments were directly 
and exclusively felt by the individual congressmen. These results 
complement Abrams and Settle’s (1978) finding that congressional 
voting on those amendments to the 1974 FECA bill generating party- 
specific benefits and costs (i.e., amendments regulating the financing 
of presidential campaigns) was on a partisan basis. 
This paper examines the interaction between union wages and the 
international pattern of production and trade. If union goods are 
heterogeneous in labor intensity, the introduction of an active union 
in the domestic country causes only the least labor-intensive range of 
union goods to be produced there, with goods of greatest labor 
intensity produced abroad because of the relatively high cost of do¬ 
mestic union labor. A narrowing of the scope of domestic union 
production will eliminate relatively labor-intensive goods, leading a 
rent-maximizing union to raise its union premium. The implications 
of this union behavior for comparative statics results are considered. 


I. Introduction 

General equilibrium analysis of labor unions has taken place primar¬ 
ily within the context of closed- or small-open-economy versions of 
the two-sector neoclassical model. The early contributions to this liter¬ 
ature introduce an exogenous union wage premium in one sector and 
consider the implications for various properties of the general equi¬ 
librium (see, e.g., Johnson and Mieszkowski 1970; Jones 1971; Magee 
1971; Diewert 1974). Several recent papers have made endogenous 
the actions of the labor union in an effort to understand both the way 
unions respond to changing international conditions and the implications 
of union behavior for broader comparative statics results. 1 What 
has not been considered formally is the effect of union activity on the 
international location of production and the pattern of international 
trade, that is, the simultaneous determination of the union wage and 
the set of products produced by union workers. 

By modeling the union as operating in an import-competing sector 
that produces a variety of heterogeneous goods, this paper explores the 
union’s impact on the pattern of trade and, at the same time, the 
effect on union wages of shifting international patterns of produc¬ 
tion. In particular, if goods within the union sector differ in the 
intensity with which they require union labor and if the (domestic) 
union wage premium is an important source of international cost 
differences, then only the least labor-intensive range of union sector 
goods will be produced domestically, with goods of greatest labor 
intensity being produced abroad as a result of the relatively high cost 
of domestic union labor. With goods arranged in order of increasing 
labor intensity, the identity of the “marginal" good—the good of 
highest labor intensity that is produced in the domestic union sec¬ 
tor—will be determined by the size of the domestic union wage pre¬ 
mium: it defines the scope of domestic production in the union sector, 
and hence the range of domestic export goods, as a function of union 
behavior. However, the scope of domestic union sector production 
will itself affect the sectoral labor intensity of production, on which 
the wage premium of a rent-maximizing union depends. 2 As such, 
when a union supplies labor for the production of heterogeneous 
goods, the union wage premium and the pattern of international 
trade will be determined simultaneously. 

The key ingredients of the formal model developed below to ex¬ 
plore the link between trade patterns and union behavior are that 
countries differ only with respect to their degree of union activity, 
that goods produced with union labor are heterogeneous in the inten¬ 
sity with which their production requires union labor, and that there 
is a single labor union setting a uniform wage across goods of the 
“union sector.” The first assumption is extreme but represents a mod¬ 
eling technique that has proved useful in highlighting the contribu¬ 
tions of such determinants as endowments and technology to the 

1 See, e.g., Oswald (1982) and Hill (1984). Grossman (1984) has focused on modeling 
the response of union wage demands to an increase in international competition when 
the union follows a seniority layoff and rehiring rule. His results suggest that such 
hiring rules could be responsible for the empirical observation that sectoral wages often 
fail to fall in response to a reduction in the price of competing import goods. The 
results of this paper provide an alternative explanation for such wage behavior. 

2 The effect of labor intensity on the derived demand elasticity for labor is often cited 
as one explanation of greater union activity in more capital-intensive sectors. For em¬ 
pirical evidence on this relationship, see, e.g., Hirsch and Berger (1984). 
pattern of international trade. 3 The second assumption approximates 
an industrywide union, such as the United Auto Workers or the 
United Steel Workers, whose members produce a variety of products. 
Finally, the assumption of a single union is made to keep the model 
simple. Extending the model to include many unions operating 
within the union sector is discussed in a note. 

The particular framework on which the analysis rests is a variant of 
the two-country continuum-of-goods Ricardian model of Dornbusch, 
Fischer, and Samuelson (1977) and is similar in some respects to the 
model developed in Dixit and Grossman (1982). There are two final- 
goods sectors and a nontraded intermediate-good sector. In the 
union sector, firms combine unionized labor and the intermediate 
good in different proportions to produce the various goods of the 
sector. For concreteness, this sector might be thought of as the auto¬ 
mobile industry, the union as the United Auto Workers, and the 
various goods within the sector as the array of different models pro¬ 
duced. The goods in the nonunion sector are produced with combi¬ 
nations of nonunion labor and the intermediate good and, without 
loss of generality, can be aggregated into one nonunion good. The 
intermediate good is nontraded and is produced in the intermediate- 
good sector with nonunion labor alone. To neutralize any Ricardian 
basis for international trade, technological differences between coun¬ 
tries that would lead to comparative advantage are assumed to be 
absent: only the operation of a labor union in the domestic country 
distinguishes it from the foreign country. Finally, the domestic labor 
union is assumed to organize workers in the domestic union sector 
and to choose a single rent-maximizing wage at which its members 
will be hired to produce the heterogeneous goods of the sector. 

The relationship between the union wage premium and the pattern 
of production and trade that emerges from this model has important 
implications for the model’s comparative statics results. Since the 
aggregate labor intensity of the domestic union sector is increasing in 
the scope of domestic production, the wage premium of a rent- 
maximizing domestic labor union will rise in response to increased 
“intensity” of foreign competition. Consequently, the scope of domes¬ 
tic production takes on a significance of its own, and the effects of 
labor migration, demand shifts, and technological change will be al¬ 
tered according to their respective impacts on domestic production. 
In this way, union activity can alter in a systematic way the standard 
comparative statics results familiar from competitive Ricardian trade 
theory. 

3 For example, Ricardian trade theory singles out technological differences as the 
determinant of trade patterns, while Heckscher-Ohlin theory focuses on differences in 
relative factor endowments. 
The remainder of the paper proceeds as follows. After the model is 
developed in Section II, Section III illustrates the effect of an active 
union on comparative statics results of the Ricardian model by consid¬ 
ering the union response to a decline in demand for union products, 
to a policy of directing research and development efforts into the 
union sector, and to international labor migration. Section IV con¬ 
cludes the paper. 

II. The Model 

The continuum-of-goods Ricardian model of Dornbusch et al. (1977) 
yields strong comparative statics results concerning the effects of 
changes in technology, tastes, and national labor endowments on the 
terms of trade, relative and real wages, and the scope of production in 
each of the two countries of the model. A convenient graphical repre¬ 
sentation of the model is developed by the authors to provide a simple 
and intuitive method for determining the equilibrium and generating 
comparative statics results. The purpose of this section is to develop a 
pair of diagrams that together characterize the world trade equilib¬ 
rium in the presence of a rent-maximizing labor union in the domes¬ 
tic country. This is accomplished in two steps. First, the model is 
solved given an exogenous union wage premium. Then union behav¬ 
ior is explicitly considered, and the general equilibrium of the model 
is obtained. 


Exogenous Union Wage Premium 

Located in each of the two countries of the model are two final-goods 
sectors and an intermediate-good sector. The intermediate good, 
which will be called “capital” and which is nontraded by assumption, is 
used as an input into the production of final goods and is produced 
with nonunion labor alone according to a linear homogeneous tech¬ 
nology common to both countries. Define units of the capital good so 
that one unit of labor produces one unit of capital in either country. 
Let r and w, respectively, be the price of the capital good and the wage 
of nonunion labor at home, and define r* and w* similarly abroad, all 
measured in any common unit. Then perfect competition will ensure 
that 


$$ r = w; \qquad r^* = w^* \tag{1} $$

as long as the capital good is produced in both countries. 

The two traded-goods sectors employ labor and capital to produce 
final goods for consumption. Consider first the domestic economy. 
Goods of the nonunion sector are produced with combinations of 
nonunion labor and capital according to linear homogeneous technologies. 
Since the relative price of factors used in this sector is fixed 
by (1), competition keeps the relative prices of goods produced in the 
nonunion sector fixed as well. Under the assumption that subutility 
over the nonunion goods is homothetic, a composite nonunion good 
can be defined whose production requires inputs of capital and 
nonunion labor in fixed proportions. This composite nonunion good 
will be called good 2. 

The union sector contains a continuum of goods indexed by z ∈ [0, 
1] and produced under constant returns to scale by combining capital 
and union labor in fixed proportions. The assumption of Leontief 
technologies in the union sector is primarily for graphical convenience, 
though its implications for the endogenous determination of 
the union wage premium will be discussed below. 4 Goods in the union 
sector are indexed according to increasing labor intensity, and the 
ratio of labor to capital, while uniquely fixed for any good, is assumed 
to vary continuously between zero and infinity as z goes from zero to 
one. Finally, domestic preferences over the entire set of consumption 
goods z ∈ {[0, 1], 2} are assumed to be Cobb-Douglas. 

Now consider the foreign country. As noted above, technology for 
producing the capital good is identical at home and abroad. Further, 
assume that the Cobb-Douglas preferences over the final goods are 
shared by both countries. Finally, to neutralize any Ricardian basis for 
trade, it is assumed that, in the production of final goods, an economywide 
efficiency differential exists between the domestic and foreign 
countries that may give rise to an absolute but not to a comparative 
advantage. That is, with l(z) and k(z) defined as labor and capital 
requirements, respectively, for unit production of good z in the domestic 
country and l*(z) and k*(z) defined analogously for the foreign 
country, it is assumed that 

$$ \frac{l^*(z)}{l(z)} = \frac{k^*(z)}{k(z)} = e \quad \text{for all } z \in \{[0, 1], 2\}, \tag{2} $$

where e measures the efficiency differential between domestic and 
foreign production of final goods. A rise in e corresponds to an in¬ 
crease in the relative efficiency of domestic producers. 

In the absence of a domestic union wage premium, condition (2) 
implies that there will exist no basis for trade between the two coun- 

4 What is important is not that the elasticity of factor substitution is zero for goods in 
the union sector, but that it is the same across goods. Note 6 contains a discussion of 
varying factor substitution elasticities. Note also that, at the sectoral level, factor sub¬ 
stitution will occur in response to changing relative factor prices, but it will be accom¬ 
plished by altering the mix of goods produced within the sector rather than by altering 
the mix of factors in the production of any good. 
tries. A single equilibrium wage w will be earned by all domestic labor, 
while w* will be earned by labor abroad. With (1) and (2) given and 
with no transportation costs, relative wages at home and abroad must 
in equilibrium satisfy 

$$ \frac{w}{w^*} = e, \tag{3} $$

since otherwise one country would have a cost advantage in the pro¬ 
duction of all traded goods. Under equilibrium condition (3), neither 
country has a cost advantage in the production of any good and, as a 
result, the international pattern of production is completely arbitrary. 

The introduction of a domestic union wage premium provides the 
basis for trade between the two countries. A two-quadrant version of 
the Lerner-Pearce diagram familiar from Heckscher-Ohlin trade theory 
can be used to illustrate the no-trade equilibrium and to show how 
the existence of a domestic union wage premium gives rise to interna¬ 
tional trade. With labor measured on the horizontal axis and capital 
measured on the vertical axis, the right and left quadrants of figure 1 
depict the unit isocost lines and unit value isoquants that obtain in the 
union sector and the nonunion sector, respectively, in the absence of a 
union wage premium. For graphical convenience, e is set to unity, 
implying that technologies are identical in the two countries. 

In the absence of a domestic union wage premium, and with e set to 
one, (3) implies that, in equilibrium, domestic and foreign wages will 
be identical. With this common wage normalized to unity, (1) implies 
that the unit isocost line pictured in each quadrant of figure 1 will be 
shared by both countries and intersects each axis of the two-quadrant 
Lerner-Pearce diagram at one, reflecting the reciprocal of the unitary 
wage paid to labor in each sector and the reciprocal of the unitary 
price of capital. Cobb-Douglas preferences imply that each good will 
be demanded and thus produced somewhere in the free-trade equi¬ 
librium, so that final-goods prices will adjust to ensure that every unit 
value isoquant just touches the unit isocost line in its sector. Since the 
unit value isoquants and unit isocost lines are shared by the two coun¬ 
tries, zero-profit production of any good can occur in either country, 
and, as such, the international location for the production of any final 
good will be completely arbitrary. 

Finally, recall that, in the union sector, each good z is assumed to be 
produced with a unique ratio of labor to capital; this too is reflected in 
figure 1. In particular, the ratio (l/k)(z') for any z' ∈ [0, 1] can be read 
from the right-hand quadrant of figure 1 as the inverse of the slope of 
the ray from the origin through the point on the curve labeled ZZ that 
lies vertically above z'. To comply with the assumption that the labor- 
capital ratio is continuous and monotonically increasing in z over the 
interval z ∈ [0, 1] with (l/k)(0) = 0 and (l/k)(1) = ∞, it is sufficient that 
the ZZ curve, which associates with each z ∈ [0, 1] a labor-capital ratio 
in the way described above, intersect the horizontal axis at one, be 
continuous over z ∈ [0, 1], and be differentiable in (0, 1) with a finite 
slope that is at each point less than the slope of the ray from the origin 
through that point. 

Now consider the introduction of an exogenous union wage premium, 
p ≡ (ω − w)/ω, which raises above one the wage paid to labor 
employed in the domestic union sector, ω. Since the domestic union 
wage is now higher than the unitary wage abroad, the domestic price 
of capital must fall below one if the union sector is to continue to 
operate at all in the domestic country. This implies, with (1), that the 
domestic nonunion wage must fall. In fact, for any given union pre¬ 
mium, the level of the domestic nonunion wage will determine com¬ 
pletely the scope of domestic production and the pattern of interna¬ 
tional trade. This is depicted in figure 2. 

With the foreign wage still normalized to unity, the foreign unit 
isocost lines continue to intersect each axis of the Lerner-Pearce diagram 
at one. The domestic unit isocost line in the nonunion sector will 
now be given by a line such as ed in the left quadrant of figure 2, 
with vertical intercept 1/r, where 1/r > 1/r* = 1, and horizontal intercept 
1/w, where 1/w = 1/r by (1). The domestic unit isocost line in the 
union sector will be given by the line dd in figure 2, with vertical intercept 
1/r, horizontal intercept 1/ω, where 1/ω < 1/w* = 1, and slope 
−[1/(1 − p)]. The exact positions of ed and dd for any union premium 
will depend on the level of the domestic nonunion wage w, 
which can be determined once the demand side of the model is completed. 

Finally, goods prices will eliminate profits in equilibrium. In the left 
quadrant of figure 2, the unit value isoquant will move out along a 
radial path from the origin until its vertex lies on the outermost unit 
isocost line, that of the domestic country. Accordingly, the nonunion 
good will be produced domestically. Similarly, each unit value iso¬ 
quant in the right quadrant will move along a radial path from the 
origin until its vertex lies on the convex hull of the two unit isocost 
lines. The good labeled z̃ in figure 2 is the “marginal” good whose 
production can occur in either country in equilibrium, with all goods z ∈ 
[0, z̃) produced at home and all goods z ∈ (z̃, 1] produced abroad. 
Consequently, in the presence of a domestic labor union, the domestic 
country specializes in the production of the nonunion good and of the 
capital-intensive goods of the union sector. 

The real wage effects of the domestic labor union can also be read 
from figure 2. The foreign nominal wage w* is unchanged at its 
normalized value of unity. Prices of goods z ∈ [z̃, 1] are also unchanged 
relative to w*. However, prices of goods z ∈ {[0, z̃), 2} have 
fallen relative to w* as a result of the domestic union wage premium, 
as reflected in the outward radial shift of the unit value isoquants for 
these goods. Therefore, foreign labor gains in real terms from the 
unionization of the domestic labor force because the domestic union 
wage premium has provided the basis for trade between the two 
countries and foreign labor enjoys the benefits of this trade. The real 
wage of those employed in the domestic union sector rises by an even 
greater amount since ω increases relative to the foreign wage w*. 
However, the domestic labor union causes the wage of laborers employed 
in the domestic nonunion sector to fall in real terms with 
respect to every good except z ∈ {0, 2}, which by assumption use only 
nonunion labor in their production and whose prices therefore move 
in tandem with the domestic nonunion wage w. Hence, for any union 
wage premium and domestic nonunion wage, the Lerner-Pearce diagram 
of figure 2 can be used to determine the pattern of production 
and trade and the real returns to factors in the two countries. 

The next step is to determine the domestic nonunion wage w as a 
function of the exogenous union wage premium. To begin, the marginal 
good z̃ can be defined implicitly by setting the production cost of 
z̃ equal in the two countries. With the foreign wage normalized to one 
and with r = w by (1), z̃ will be an implicit function of w and e and of 
the domestic union wage premium p as given by 

$$ \frac{w}{1 - p}\, l(\tilde z) + w\, k(\tilde z) = e\,[\,l(\tilde z) + k(\tilde z)\,]. \tag{4} $$

Expression (4) can be manipulated to yield 

$$ \frac{l}{k}(\tilde z) = \frac{(1 - p)(e - w)}{w - (1 - p)e}. \tag{5} $$

Define f(·) ≡ (l/k)^{-1}(·). Then (5) can be solved for z̃ as 

$$ \tilde z = f\!\left( \frac{(1 - p)(e - w)}{w - (1 - p)e} \right). \tag{6} $$

Since by assumption the labor-capital ratio is continuous and monotonically 
increasing in z over the interval z ∈ [0, 1] with (l/k)(0) = 0 
and (l/k)(1) = ∞, f(·) will be continuous and monotonically increasing 
in its argument with f(0) = 0 and f(∞) = 1. As such, expression (6) 
implies that, for any 0 < p < 1, w must satisfy e(1 − p) < w ≤ e. If 
w = e, z̃ = 0 by (6) and the domestic country produces only the 
nonunion good; on the other hand, w must be strictly greater than e(1 
− p) since if w = e(1 − p), then by (6) z̃ = 1 and the domestic country 
would produce everything. Finally, inspection of (6) reveals that z̃ is a 
continuous and decreasing function of w and p and an increasing 
function of e: 

$$ \tilde z = \tilde z(\overset{(-)}{w},\ \overset{(-)}{p},\ \overset{(+)}{e}). \tag{7} $$
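
The comparative statics summarized in (7) can be checked numerically with a minimal sketch. The text leaves the technology general; the code below assumes, purely for illustration, the labor-capital ratio (l/k)(z) = z/(1 − z), whose inverse is f(x) = x/(1 + x). The function names are hypothetical and not part of the model.

```python
def f(x):
    """Inverse of the assumed ratio (l/k)(z) = z / (1 - z).
    Satisfies f(0) = 0 and f(x) -> 1 as x -> infinity."""
    return x / (1.0 + x)


def marginal_good(w, p, e):
    """Marginal good implied by cost equality across countries:
    z = f((1 - p)(e - w) / (w - (1 - p)e)),
    defined for e(1 - p) < w <= e."""
    lower = (1.0 - p) * e
    if not (lower < w <= e):
        raise ValueError("w must satisfy e(1 - p) < w <= e")
    return f((1.0 - p) * (e - w) / (w - lower))


base = marginal_good(0.90, 0.20, 1.00)
print(base)
print(marginal_good(0.85, 0.20, 1.00) > base)  # lower w widens domestic scope
print(marginal_good(0.90, 0.15, 1.00) > base)  # lower p widens domestic scope
print(marginal_good(0.90, 0.20, 1.05) > base)  # higher e widens domestic scope
```

Under this assumed technology the marginal good equals zero at w = e and approaches one as w falls toward e(1 − p), matching the bounds stated in the text.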


Next, define Γ(z̃(w, p, e), b(2)) as the fraction of income spent (anywhere) 
on those goods in which the home country has a comparative 
advantage, or 

$$ \Gamma(\tilde z(w, p, e), b(2)) \equiv b(2) + \int_0^{\tilde z(w, p, e)} b(z)\,dz, \tag{8} $$

where b(2) is the budget share allotted to consumption of good 2, and 
b(z)dz is the budget share allotted to consumption of union sector 
goods in the interval [z, z + dz]. Then 1 − Γ(·) is the fraction of income spent 
(anywhere) on goods produced abroad. The properties of z̃(w, p, e) 
noted in (7) and the nonnegativity of b(z) imply that Γ(z̃(w, p, e), b(2)) is 
nonincreasing in both w and p and nondecreasing in e and b(2). 

Finally, since preferences are Cobb-Douglas, the fraction of world 
income captured by the domestic labor union in the form of union 
rents is given by R(z̃(w, p, e), p) defined as 

$$ R(\tilde z(w, p, e), p) \equiv p \int_0^{\tilde z(w, p, e)} b(z)\, \frac{l(z)}{l(z) + (1 - p)k(z)}\,dz. \tag{9} $$

Then Γ(z̃(w, p, e), b(2)) − R(z̃(w, p, e), p) is the fraction of world income 
received by the domestic labor force, net of union rents. 
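
One way to see how union rents enter as a share of spending on each domestically produced union good is the following sketch. It relies on zero-profit pricing and on r = w from (1); the symbol P(z) for the price of good z is notation introduced here for illustration and does not appear in the text.

```latex
% Sketch: share of union rents in the price of a domestically produced union good z.
% Assumed notation: \omega is the union wage, so p = (\omega - w)/\omega and w = (1 - p)\omega.
% P(z) is the zero-profit price of good z (hypothetical symbol, introduced for this sketch).
\begin{align*}
P(z) &= \omega\, l(z) + r\, k(z)
      = \omega\,[\,l(z) + (1 - p)\,k(z)\,]
      && \text{using } r = w = (1 - p)\,\omega, \\
\frac{(\omega - w)\, l(z)}{P(z)}
     &= \frac{p\,\omega\, l(z)}{\omega\,[\,l(z) + (1 - p)\,k(z)\,]}
      = p\,\frac{l(z)}{l(z) + (1 - p)\,k(z)} .
\end{align*}
```

With Cobb-Douglas budget shares b(z)dz, weighting this rent share by spending on each good and summing over the goods produced at home gives the fraction of world income that accrues as union rents. Subtracting it from the spending share of each good leaves (1 − p)[l(z) + k(z)]/[l(z) + (1 − p)k(z)], the part received by domestic labor net of rents.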

With L and L* defined as the domestic and foreign labor force, 
respectively, and with w* normalized to one, domestic income Y will 
be given by 

$$ Y = \frac{wL + R(\tilde z(w, p, e), p)\,L^*}{1 - R(\tilde z(w, p, e), p)}. \tag{10} $$

The equilibrium value of the domestic nonunion wage, w, is then 
determined by the balanced trade condition 

$$ [1 - \Gamma(\tilde z(w, p, e), b(2))]\,Y = \Gamma(\tilde z(w, p, e), b(2))\,L^* $$

or 

$$ w = \frac{\Gamma(\tilde z(w, p, e), b(2)) - R(\tilde z(w, p, e), p)}{1 - \Gamma(\tilde z(w, p, e), b(2))} \left( \frac{L^*}{L} \right) \equiv B\!\left( \tilde z(w, p, e), p, b(2), \frac{L^*}{L} \right). \tag{11} $$

For any exogenous value of the union wage premium, p, the equilibrium 
domestic nonunion wage can be found as the solution to (11). 
Lemma. A unique w exists for any 0 < p < 1, provided that e > 
{b(2)/[1 − b(2)]}(L*/L). 

Proof. From their respective definitions, it follows immediately that, 
for 0 < p < 1, 

$$ \begin{aligned}
\Gamma(\tilde z(w = e, p, e), b(2)) &= b(2), \\
R(\tilde z(w = e, p, e), p) &= 0, \\
\Gamma(\tilde z(w = e(1 - p), p, e), b(2)) &= 1, \\
R(\tilde z(w = e(1 - p), p, e), p) &< 1.
\end{aligned} \tag{12} $$

Consequently, 

$$ \begin{aligned}
B\!\left( \tilde z(w = e, p, e), p, b(2), \frac{L^*}{L} \right) &= \frac{b(2)}{1 - b(2)}\,\frac{L^*}{L} < e, \\
B\!\left( \tilde z(w = e(1 - p), p, e), p, b(2), \frac{L^*}{L} \right) &= \infty.
\end{aligned} \tag{13} $$

In addition, [Γ(·) − R(·)] can be rewritten as 

$$ [\Gamma(\cdot) - R(\cdot)] = b(2) + (1 - p) \int_0^{\tilde z(w, p, e)} b(z)\, \frac{l(z) + k(z)}{l(z) + (1 - p)k(z)}\,dz. \tag{14} $$

Since z̃(w, p, e) is decreasing and continuous in w, [Γ(·) − R(·)] is a 
nonincreasing and continuous function of w, while [1 − Γ(·)] is a 
nondecreasing and continuous function of w, so that B(·) is a continuous 
function of w and 

$$ \frac{\partial B(\tilde z(w, p, e), p, b(2), L^*/L)}{\partial w} \le 0. \tag{15} $$

Conditions (13) and (15) and the continuity of B(·) ensure the existence 
of a unique w that solves (11). Q.E.D. 
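
The fixed-point argument in the lemma can be illustrated numerically. The sketch below is not the paper's computation: it assumes a hypothetical Leontief technology l(z) = z, k(z) = 1 − z, uniform budget shares b(z) = 1 − b(2) on the union goods, and illustrative parameter values e = 1, b(2) = 0.4, and L*/L = 1, which satisfy the lemma's condition. It then solves w = B(·) by bisection on the interval from e(1 − p) to e, using the sign pattern that the proof establishes at the two endpoints.

```python
import math


def marginal_good(w, p, e):
    """Marginal good under the assumed ratio (l/k)(z) = z / (1 - z)."""
    lower = (1.0 - p) * e
    if w <= lower:
        return 1.0  # limiting case: everything is produced at home
    x = (1.0 - p) * (e - w) / (w - lower)
    return x / (1.0 + x)


def integrate(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def spending_share(z_tilde, b2):
    """Share of income spent on home goods, with uniform b(z) = 1 - b2."""
    return b2 + (1.0 - b2) * z_tilde


def rent_share(z_tilde, p, b2):
    """Share of world income taken as union rents, with l(z) = z, k(z) = 1 - z."""
    def integrand(z):
        return (1.0 - b2) * z / (z + (1.0 - p) * (1.0 - z))
    return p * integrate(integrand, 0.0, z_tilde)


def B(w, p, e, b2, labor_ratio):
    """Right-hand side of the balanced-trade condition w = B(.)."""
    z = marginal_good(w, p, e)
    gamma = spending_share(z, b2)
    if gamma >= 1.0:
        return math.inf
    return (gamma - rent_share(z, p, b2)) / (1.0 - gamma) * labor_ratio


def equilibrium_wage(p, e=1.0, b2=0.4, labor_ratio=1.0, tol=1e-10):
    """Bisection on B(w) - w, which is positive near e(1 - p) and negative at e."""
    lo, hi = (1.0 - p) * e, e
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if B(mid, p, e, b2, labor_ratio) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for premium in (0.2, 0.4):
    print(premium, equilibrium_wage(premium))
```

Under these assumptions the computed wage lies strictly between e(1 − p) and e and is lower for the larger premium, consistent with the existence and uniqueness claimed in the lemma.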

With equilibrium condition (11), the equilibrium domestic nonunion 
wage can be determined as a function of p for fixed values of e, 
b(2), and L*/L. In particular, for p = 0, we know from (3) that w = e. 
Further, from their definitions in (8) and (14), Γ(·) and [Γ(·) − R(·)] 
are nonincreasing in p so that 

$$ \frac{\partial B(\tilde z(w, p, e), p, b(2), L^*/L)}{\partial p} \le 0. \tag{16} $$

With (15) and (16), it follows from (11) that dw/dp < 0. Finally, for p = 
1, expression (6) implies that z̃ = 0, which, according to (8) and (9), 
means that Γ(z̃(w, p = 1, e), b(2)) = b(2) and R(z̃(w, p = 1, e), p = 1) = 
0. Consequently, (11) implies that when p = 1, w = {b(2)/[1 − 
b(2)]}(L*/L). It is assumed that this value is less than e. The relationship 
describing w as a function of p implicit in condition (11) is summarized 
by the negatively sloped curve in figure 3. For any exogenous union 
premium, this curve gives the equilibrium value of the domestic 
nonunion wage. 

Modeling Union Behavior 

The potential importance of making endogenous the pattern of production
and trade from the standpoint of determining the effects of foreign
competition on union wage-setting behavior is brought out by noting
that trade will have two opposing effects on the optimal union wage
premium in this model. On the one hand, the union wage will be
constrained by international trade through a higher elasticity of
derived demand for union labor: this results from the international
relocation of production that would occur at the margin in response to
further increases in the union wage. On the other hand, the goods whose
production does remain at home will be those that use relatively
unintensively the services of union labor, and the average labor
intensity of production in the union sector will decline: this effect
tends to reduce the elasticity of derived demand for union labor and
consequently leads to a higher union wage. If the former effect is
invariant with respect to the scope of domestic production, then
through the latter effect, “intensification” of foreign competition,
which manifests itself in a narrower scope of domestic production, will
bring about a higher union wage premium since the increased competition
from abroad weeds out precisely those plants whose relatively intensive
use of union labor held down the domestic union wage premium. These
results are derived formally as follows.

Domestic union membership is taken as exogenously determined. Union
members who do not get jobs in the union sector are assumed to find
employment in the nonunion sector at the prevailing nonunion wage. The
union is assumed to choose ω to maximize the rents earned by its
members, taking the domestic nonunion wage and level of world income
(Y + L*) as fixed.⁵ Domestic union rents can be written as

$$\Pi(\omega) = (\omega - w)\int_0^{\tilde z(w, \rho(\omega), e)} l(z)\,c(z; P(z; \omega))\,dz \equiv (\omega - w)D(\omega), \qquad (17)$$

where c(z; P(z; ω))dz is world demand for union sector goods z ∈ [z,
z + dz] and P(z; ω) ≡ ωl(z) + wk(z) is the price of good z. World
income has been suppressed as an argument of demand since it is by
assumption taken as given by the domestic union. The first-order
condition for the union’s problem is

$$\frac{d\Pi(\omega)}{d\omega} = D(\omega) + (\omega - w)\frac{dD(\omega)}{d\omega} = 0. \qquad (18)$$

Manipulation yields an expression for the optimal union premium

$$\rho = \frac{1}{\eta}, \qquad (19)$$

where η ≡ [-dD(ω)/dω][ω/D(ω)] is the elasticity of derived demand for
domestic union labor with respect to ω with the sign reversed.

5 As noted in Hill (1984), rent-maximizing behavior is a special case
of the union objective function employed by McDonald and Solow (1981),
in which union members are risk neutral. That the union ignores its
effect on the domestic nonunion wage, the level of world income, and
the price of the aggregate consumption bundle when setting its wage is
an assumption that can be motivated by thinking of this union as one of
many unions, no one of which is large enough to affect aggregate
variables, but which together have a significant impact. This
motivation can be formalized by letting unit intervals be indexed by i,
with i = 0, 1, 2, ..., I, and then associating with each unit interval
z ∈ [i, i + 1] a union identical to the union studied above. The
adding-up constraint on budget shares then becomes

$$\sum_{i=0}^{I}\int_i^{i+1} b(z)\,dz = 1 - b(2),$$

so that as I tends to infinity, the budget share associated with the
goods of any single union sector goes to zero. With R and Γ as defined
in the text now redefined as

and

$$\Gamma = b(2) + \sum_{i=0}^{I}\int_i^{\tilde z_i} b(z)\,dz,$$

where z̃ᵢ is the marginal good of the ith union sector, it can then be
shown with (11) that as I tends to infinity, dw/dωᵢ tends to zero and,
since Y + L* = (wL + L*)/(1 - R), so too does d(Y + L*)/dωᵢ. It is then
straightforward to show that the union behavior explored in the text
emerges formally as I tends to infinity if each of the I unions chooses
ωᵢ independently to maximize the expected utility of its members. For a
paper focusing on many small unions in a continuum-of-goods general
equilibrium model, see Staiger (1985).
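
The step from the first-order condition (18) to the premium rule (19) can be checked numerically. The sketch below assumes, purely for illustration, a constant-elasticity derived demand D(ω) = ω^(-η), which is not the demand system of the model, and it defines the premium as ρ = (ω - w)/ω. It maximizes rents (ω - w)D(ω) by golden-section search and returns the implied premium, which can be compared with 1/η. All names are hypothetical.

```python
import math


def rents(omega, w, eta):
    """Union rents (omega - w) * D(omega) with assumed D(omega) = omega**(-eta)."""
    return (omega - w) * omega ** (-eta)


def optimal_premium(w, eta, upper_mult=10.0, tol=1e-12):
    """Maximize rents over omega in [w, upper_mult * w] by golden-section search.

    Assumes eta is large enough (eta >= 10/9 for the default bracket) that
    the maximizer w * eta / (eta - 1) lies inside the bracket.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = w, upper_mult * w
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if rents(c, w, eta) < rents(d, w, eta):
            a = c   # the maximum lies to the right of c
        else:
            b = d   # the maximum lies to the left of d
    omega_star = 0.5 * (a + b)
    return (omega_star - w) / omega_star
```

Under this assumed demand curve the rent-maximizing wage is ω = wη/(η - 1), so the premium equals 1/η regardless of the level of w, which is the content of (19).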

The derived demand elasticity η can be broken into two components, one
associated with changes in demand for each union good, with the scope
of domestic production held constant, and the other associated with
changes in the scope of domestic production itself. Explicit
calculation of η yields

$$\eta = \int_0^{\tilde z(w, \rho, e)} \lambda(z; \tilde z(w, \rho, e), \rho)\,\theta(z; \rho)\,dz + \sigma, \qquad (20)$$

where

$$\theta(z; \rho) \equiv \frac{\omega l(z)}{P(z; \omega)} = \frac{l(z)}{l(z) + (1 - \rho)k(z)}$$

is the domestic union labor’s share of production costs for good z,

$$\lambda(z; \tilde z(w, \rho, e), \rho) \equiv \frac{b(z)\theta(z; \rho)}{\int_0^{\tilde z(w, \rho, e)} b(z)\theta(z; \rho)\,dz}$$

is the share of derived demand for domestic union labor associated
with good z, and

$$\sigma \equiv -\omega\,\frac{d\tilde z}{d\omega}\,\lambda(\tilde z(w, \rho, e); \tilde z(w, \rho, e), \rho)$$

is the elasticity of derived demand for domestic union labor associated
with the international relocation of the production of marginal goods.
According to (20), η can be written as the elasticity of demand for
domestic union labor associated with changes in the scope of domestic
production, σ, plus a weighted average of derived union labor demand
elasticities across goods produced in the domestic union sector, which,
because of the Cobb-Douglas demand and Leontief technology assumptions,
are given by the union cost share variable θ(z; ρ).⁶ The optimal union
premium according to (19) is simply the inverse of this sum.

Given e, (19) defines the equilibrium ρ as a function of w. A
sufficient condition for ρ ∈ [0, 1) to exist for a given w is that σ
exceed one over the relevant range. Since (6) implies that dz̃/dω =
-f'(·)[(e - w)/(ω - e)²], σ will be greater than one over the relevant
range provided that f'(·) is sufficiently large, that is, provided that
variation in labor-capital ratios across z in the relevant range is not
too large. The second-order condition will also be met provided that
f'(·), and thus σ, is sufficiently large at the optimum. This is
assumed to be the case. Finally, it is assumed that the distribution of
budget densities and f''(·) are such that σ is invariant with respect to
changes in the scope of domestic production in the relevant range. It
is then readily shown from (19) that, provided second-order conditions
are met,

$$\frac{d\rho}{dw} > 0. \qquad (21)$$

The relationship describing ρ as a function of w is illustrated by the
positively sloped curve in figure 3. For given values of w, this curve
gives the value of ρ satisfying the rent-maximizing conditions of the
union. The solution to the two equilibrium conditions of the model,
equations (11) and (19), is illustrated in figure 3 as (ŵ, ρ̂). Finally,
once general equilibrium values have been determined for the union wage
premium and the domestic nonunion wage in figure 3, figure 2 can be
used to determine the real wages paid to those employed in the union
and nonunion sectors at home and to labor abroad and the equilibrium
pattern of production and trade. The next section explores how the
endogeneity of the union affects several comparative statics results of
the model.

III. Comparative Statics 

The model developed in the previous section can be used to illustrate
the effects of changes in the international environment on union
behavior. This section explores the union response to three events: a
shift in consumer preferences toward nonunion goods, the imposition of
a domestic targeting program aimed at union sector goods, and the
international migration of nonunion labor.

6 More generally, the elasticities of factor substitution and product
demand will also enter into the determination of the derived demand
elasticity for union workers. However, empirical evidence of uniformly
low substitution elasticities between union and nonunion inputs can be
found in Freeman and Medoff (1981, 1982), and it seems natural to focus
on variations in cost shares as the main element of heterogeneity among
goods served by the union. Limited variation in factor substitution or
demand elasticities across goods would complicate but not alter the
conclusions of this paper.

Demand Shifts 

Lawrence and Lawrence (1985) provide an explanation for the rising 
union wage premium in the United States over the period 1970-84, 
which is based on a prediction of rising union wage differentials in 
response to long-run declines in demand growth: the decline in demand
growth in the union sector reduces the substitution possibilities
between capital and labor, leading to a less elastic derived demand for 
union labor and a greater wage premium. The model of Section II 
yields a similar relationship between declining union sector demand 
and rising wage differentials, but for a very different reason. 

Consider an increase in b(2), the proportion of income spent on the
nonunion good, accompanied by a proportional decrease in the budget
shares of all union sector goods z ∈ [0, 1], so that λ(z; z̃(w, ρ, e),
ρ) is unchanged for any w, ρ, and e and the budget shares over all
final goods still sum to one. The decline in union sector demand will
have no effect on the equilibrium relationship between ρ and w given by
condition (19) since b(2) does not enter (19) directly (see the
definition of η given in [20]). Combinations of ρ and w that satisfy
(19) are depicted by the positively sloped curve in figure 4. However,
b(2) does enter into equilibrium condition (11). For any ρ > 0, the
proportional drop in the budget shares of all union sector goods z ∈
[0, 1] and the accompanying increase in b(2) will lead to a domestic
trade surplus that, according to (11), requires a rise in w to restore
equilibrium. If ρ = 1, then (11) implies that

$$w = \frac{b(2)}{1 - b(2)}\,\frac{L^*}{L} \equiv w_{\min},$$

so that w_min must rise with an increase in b(2). If ρ = 0, then w = e
and the location of production is arbitrary, so that changes in budget
shares have no effect on relative wages. This is summarized by the
upward shift in the negatively sloped curve shown in figure 4, which
depicts the relationship between w and ρ given in (11).
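
The behavior of the ρ = 1 endpoint can be made explicit. Starting from w_min = {b(2)/[1 - b(2)]}(L*/L) and holding L*/L fixed, differentiation with respect to b(2) gives the following worked step, which uses only the formula stated in the text:

```latex
\frac{\partial w_{\min}}{\partial b(2)}
  = \frac{\partial}{\partial b(2)}\left[\frac{b(2)}{1 - b(2)}\right]\frac{L^*}{L}
  = \frac{1}{[1 - b(2)]^{2}}\,\frac{L^*}{L} > 0 .
```

The ρ = 1 end of the curve therefore moves up with b(2), while the ρ = 0 end stays at w = e.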

At the original ρ₀, the domestic nonunion wage is bid up relative to
the wage of foreign labor because of the proportional shift in
preferences away from goods of the union sector and toward good 2: this
makes domestic union sector production less competitive relative to
production abroad, and the production of a marginal range of domestic
union goods is lost to foreigners. Since the domestic plants that close
are the most labor intensive of the domestic union sector, the sectoral
labor intensity of the domestic union sector declines, inducing the
rent-maximizing union premium to rise. The new equilibrium is
illustrated in figure 4 by (w₁, ρ₁), where the union premium has risen
as a result of the declining demand for union sector goods.

Fundamentally, it is not the decline in union sector demand per se but
rather the loss of marginal goods associated with it that leads to a
rise in the domestic union wage premium. The loss of marginal goods is
brought about by an increase in the domestic nonunion wage resulting
from the greater demand for domestic nonunion workers. As such, the
model associates rising union wage premiums with falling union sector
demand only when, as in the proportional case considered here, the
shift in demand away from the goods of the union sector results in
greater demand for the services of domestic nonunion labor.

Targeting 

As Krugman (1987) has noted, the case for industrial targeting stands
or falls with the ability to identify sectors that “ought” to be
targeted, where targeting is understood to imply a policy of affecting
the sectoral pattern of investment rather than its aggregate level.
Since the choice of the targeted sector will have implications for the
scope of domestic union production in the model of Section II, it will
also affect the number of union members who earn the union wage and the
level of the union wage premium itself. This in turn can provide a
reason to alter the sectoral mix of investment through a policy of
industrial targeting.

As an illustration of this point, consider the choice between
allocating a given amount of R & D expenditures to either the union or
the nonunion sector of the domestic country, where the direct
(cost-saving) effect of the R & D results in an increase in e in the
targeted sector. Suppose that the change in e alone would lead to cost
reductions for the domestically produced goods of the sector, which
imply an equivalent increase in utility regardless of the sector chosen
for targeting. Thus, from the standpoint of the direct effect of R & D,
there is no basis on which to favor one sector over the other.

There are, however, two attributes that distinguish the union from 
the nonunion sector with regard to targeting in this model. The first, 
which has nothing to do with the operation of the union per se, is that 
the union sector happens to include the marginal good, on whose 
relative production efficiency the relationship between domestic and 
foreign wages depends. The second is the presence of the union in 
the union sector, which ensures that the sector chosen for targeting 
will have implications both for the number of union members earning 
the union wage and for the level of the union wage premium. 

Specifically, if the nonunion sector were targeted, all benefits would
be captured in the direct (cost-saving) effect: the increase in the
domestic efficiency of the nonunion sector and the resulting drop in
P(2) would be the only benefit of the program since endogenous
variables of the model would be unaffected. This is easily seen by
noting that e enters both equilibrium conditions (11) and (19) only
through z̃(w, ρ, e) and, hence, only insofar as it captures the
international technology differences in production of union sector
goods. As such, the domestic utility benefits of R & D applied to the
nonunion sector are captured completely by the resulting price
reduction for good 2.

Not so for a policy of targeting the domestic union sector that, in
altering the efficiency of production of the marginal union good,
affects the equilibrium values of w and ρ. Specifically, consider first
the impact of an increase in e on equilibrium condition (11). The
efficiency parameter e enters (11) only through its effect on z̃(w, ρ,
e) as given in expression (6). Thus, for any 0 < ρ < 1, (6) implies
that an increase in e would require an equivalent percentage increase
in w to leave z̃(w, ρ, e), and hence B(·), unchanged. But this would
leave w greater than B(·), and hence for (11) to be satisfied, w must
rise by less than the percentage increase in e. For ρ = 1, w is
unaffected by the change in e since w = w_min ≡ {b(2)/[1 -
b(2)]}(L*/L), while for ρ = 0, (3) implies that w = e and thus w rises
by the full increase in e. This information is reflected in the upward
shift of the negatively sloped curve in figure 5a. Next consider the
effect of increasing e on equilibrium condition (19). With ρ held
constant, (6) implies that w must rise by the same percentage as e to
leave z̃, and hence 1/η, unchanged. Thus the positively sloped curve in
figure 5a shifts up by the same percentage as the change in e.

The resulting equilibrium (w₁, ρ₁) is given by figure 5a. At the
original ρ₀, the scope of domestic union sector production would
increase as a result of the targeting program since according to (11)
the domestic nonunion wage does not rise to fully offset the cost
savings resulting from the technological advance. Since the additional
plants added to domestic union sector production are more labor
intensive than existing domestic production, the sectoral labor
intensity of the domestic union sector rises, causing the
rent-maximizing union premium to fall. The final equilibrium is given
by (w₁, ρ₁), where the domestic nonunion wage has increased and the
domestic union premium has fallen as a result of the union sector
targeting.

The real wage implications of the change in w and ρ induced by union
sector targeting are illustrated in figure 5b.⁷ The unit isocost lines
of the foreign country are given by the solid lines. Under the
assumption that technologies are originally identical, that is, that
e₀ = 1, the original unit isocost lines of the domestic country are
given by the dashed lines. With R & D targeted to the domestic union
sector, e will rise above one in that sector. With ρ held fixed, figure
5a shows that w (and thus r) rises, but by less than e, to w'. The
dotted lines in the two quadrants of figure 5b reflect this new
equilibrium value of w, where ρ has been held fixed. The rise in w and
r shifts inward the domestic unit isocost line in the (nontargeted)
nonunion sector, as depicted in the left-hand quadrant. In the
right-hand quadrant, the foreign country’s unit value isoquant will be
the domestic country's e value isoquant since the two countries no
longer share technologies for production of the union goods. As such,
the domestic union sector's e isocost line is depicted by the dotted
line in the right-hand quadrant. At the original ρ₀, it has shifted out
since w (and thus r) has fallen relative to e. Note that the outward
shift in the domestic union sector e isocost line is associated with a
drop in the equilibrium price of goods produced in the domestic union
sector since the domestic e value isoquants (which at unchanged prices
would not move with the rise in e) must shift radially out from the
origin to maintain zero profits in the domestic union sector.

7 The ZZ curve is suppressed from figs. 5b and 6b since it remains
unchanged throughout.

Even before allowing ρ to respond, it is apparent from figure 5a and b
that an argument can be made for targeting the union sector in this
model. From figure 5a, w increases and, with ρ held constant, so too
does ω (by the same percentage). From figure 5b, prices of all union
sector goods produced at home have fallen, while prices of all union
sector goods produced abroad remain unchanged. Finally, the price of
the domestically produced nonunion good rises by the same percentage as
w and ω. Thus domestic wages (union and nonunion) increase not only
with respect to the prices of domestically produced union sector goods
(the direct effect) but also with respect to the prices of all goods
produced abroad (the terms of trade effect).⁸ Moreover, there is a
beneficial union employment effect from targeting the union sector as
well. To see this, note that the (direct plus indirect) demand for
labor to produce any union good z' ∈ [0, z̃₀] relative to the (direct
plus indirect) demand for labor to produce the nonunion good 2 is given
by

$$\frac{c(z')[l(z') + k(z')]}{c(2)\,(1/e)\,[l(2) + k(2)]}$$

if the nonunion sector is targeted and by

$$\frac{c(z')\,(1/e)\,[l(z') + k(z')]}{c(2)[l(2) + k(2)]}$$

if instead the union sector is targeted. In either case, Cobb-Douglas
preferences imply that this ratio reduces to

$$\frac{b(z')}{b(2)}\left\{\frac{l(z') + k(z')}{[1/(1 - \rho)]\,l(z') + k(z')}\right\}.$$

Thus, with ρ held fixed, changes in e (in either sector) have no impact
on relative labor demands between the nonunion good and any union good
z' ∈ [0, z̃₀]. This implies that changes in sectoral employment are then
determined by changes in the scope of domestic union sector production
alone. Consequently, with ρ held fixed, targeting the union sector, by
increasing the scope of domestic union sector production, increases the
number of union members who are paid the union wage ω. In contrast,
targeting the nonunion sector leads only to a direct real wage effect
with respect to the nonunion good since it leaves the (double factoral)
terms of trade unaltered and since, with the scope of domestic union
production unchanged, it leaves the number of union members earning ω
unaltered as well.
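
The invariance of the relative labor demand ratio to e can be verified with a few lines of arithmetic. The sketch below assumes Cobb-Douglas demand c = b·Y/P, prices P(z') = ωl(z') + wk(z') and P(2) = w[l(2) + k(2)] with ω = w/(1 - ρ), and that targeting a sector scales its input requirements by 1/e. The numerical values and function names are hypothetical and chosen only for illustration.

```python
def relative_labor_demand(e, target, rho=0.25, w=1.0, income=1.0,
                          b_z=0.1, b_2=0.3, l_z=0.6, k_z=0.4,
                          l_2=0.5, k_2=0.5):
    """Labor demand for union good z' relative to nonunion good 2.

    `target` is "union" or "nonunion"; the targeted sector's input
    requirements are scaled by 1/e (an assumption of this sketch).
    """
    omega = w / (1.0 - rho)
    scale_z = 1.0 / e if target == "union" else 1.0
    scale_2 = 1.0 / e if target == "nonunion" else 1.0
    price_z = scale_z * (omega * l_z + w * k_z)
    price_2 = scale_2 * w * (l_2 + k_2)
    c_z = b_z * income / price_z   # Cobb-Douglas demand for good z'
    c_2 = b_2 * income / price_2   # Cobb-Douglas demand for good 2
    labor_z = c_z * scale_z * (l_z + k_z)
    labor_2 = c_2 * scale_2 * (l_2 + k_2)
    return labor_z / labor_2


def closed_form(rho=0.25, b_z=0.1, b_2=0.3, l_z=0.6, k_z=0.4):
    """(b(z')/b(2)) * [l + k] / {[1/(1 - rho)] * l + k}."""
    return (b_z / b_2) * (l_z + k_z) / (l_z / (1.0 - rho) + k_z)
```

In this sketch the factor 1/e cancels between the quantity demanded and the labor required per unit, so the ratio depends only on budget shares, input coefficients, and ρ.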

A further case for targeting the union sector comes from the additional
effect of the targeting policy on the welfare of domestic nonunion
workers once the union is allowed to react.⁹ As shown in figure 5a, ρ
falls, inducing w to rise further from w', until the new equilibrium ρ₁
and w₁ are reached. The dash-dot lines in figure 5b reflect this final
equilibrium. The additional welfare gains to domestic nonunion workers
that arise from the union response come in the form of lower-priced
union sector goods produced domestically and lower-priced imports from
abroad. As a result of the fall in ρ and the accompanying increase in
w, the domestic nonunion wage rises with respect to the prices of all
goods z ∈ [0, 1]: the relative wage increase makes foreign-produced
goods more affordable at home, while the drop in ρ lowers the price of
domestically produced union sector goods relative to the domestic
nonunion wage. Finally, note that since the union does not internalize
its impact on w when choosing ρ, there is no guarantee that the union
response to targeting will increase further the utility of domestic
union members.¹⁰ However, if dw/dρ as described by (11) is sufficiently
close to zero around ρ₀, as it would be, for example, if the union
sector were sufficiently small, then domestic union members must
benefit as well from the union’s response to the targeting program.

8 An argument for targeting "marginal" goods because of this terms of
trade effect has been made recently by Itoh and Kiyono (1987).

9 The policy of targeting analyzed here assumes that the union takes as
given the government R & D decision when it sets its wage demands.


Labor Migration 

International labor migration can be represented in the model as a
change in L*/L. In the competitive model explored in Dornbusch et al.
(1977), labor migration from the low-wage to the high-wage country
would reallocate the world stock of labor toward the country whose
marginal-good technology is most efficient. This serves to expand the
world production possibilities frontier and makes labor in the low-wage
country better off, though the (original) inhabitants of the high-wage
country suffer a welfare decline. Of course, the expansion of world
production possibilities ensures that the gainers could compensate the
losers. Findlay (1982) has argued in the context of a Ricardian model
that, with regard to several well-known notions of distributive
justice, free trade cum migration is “just" in that it at once expands
the world production possibilities and brings about a more equal
international distribution of income.

In the model of Section II, the consequences of labor migration in
response to international wage differentials can be quite different.
First, the union can cause the less efficient country to have the high
nonunion wage, so that (nonunion) labor migration in the direction of
higher wages contracts the world production possibilities frontier.
Moreover, in response to this migration abroad, the scope of domestic
production contracts, and the union wage premium therefore rises,
offsetting the real gains that would otherwise accrue to domestic
nonunion workers and reducing the amount by which migration closes the
international nonunion wage discrepancy.

10 The possibility that union members could lose from the union’s wage
response reflects the fact that, in a world of many small unions in
which all unions taken together have a significant impact on w, the
effect of union behavior on w is external to any single union’s
wage-setting decision. See n. 5 on the generalization of this
single-union model to a model with many unions.

This is illustrated in figure 6a and b. Figure 6a illustrates the
initial determination of ρ₀ and w₀, where, as drawn, e > 1 > w₀; that
is, the home country is more efficient than the foreign country in the
production of all final goods, but the domestic union wage premium has
reduced the domestic nonunion wage below the unitary wage of labor
abroad. The solid lines in figure 6b reflect foreign unit isocost
lines, while the dashed lines represent the home country’s initial e
isocost lines.

With the initial domestic nonunion wage lower than the wage abroad,
migration will occur in the direction of the foreign country. As such,
L*/L increases, and labor migrates toward the technologically inferior
country. This will have no effect on the positively sloped curve in
figure 6a, as can be seen by noting that L*/L does not enter directly
into equilibrium condition (19). However, equilibrium condition (11)
implies that the negatively sloped curve in figure 6a will shift
upward: for ρ = 1, the percentage increase in the domestic wage will
equal the percentage increase in L*/L, while for ρ = 0, w equals e and
is unaffected by the relative changes in the size of the domestic labor
force.
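
The ρ = 1 case follows directly from w_min = {b(2)/[1 - b(2)]}(L*/L). Taking logs with b(2) held fixed gives a unit elasticity with respect to L*/L, as the following worked step shows:

```latex
\ln w_{\min} = \ln\frac{b(2)}{1 - b(2)} + \ln\frac{L^*}{L}
\quad\Longrightarrow\quad
\frac{d \ln w_{\min}}{d \ln (L^*/L)} = 1 .
```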

At the initial ρ₀, the increase in the domestic nonunion wage that
results from the exodus of nonunion labor will contract the scope of
union sector production in the home country. Since the most
labor-intensive activities are the first to go, the sectoral labor
intensity of production in the domestic union sector declines, inducing
a rise in ρ. The final equilibrium is given by (ρ₁, w₁) in figure 6a,
where both the union wage premium and the domestic nonunion wage have
increased as a result of the migration.

The real wage effects of this migration are contained in figure 6b. The
dash-dot lines reflect the e isocost lines for the home country after
migration has occurred but before the union responds. The dotted lines
represent the final e isocost lines for the home country and reflect
the fact that both w and ρ have increased. The implications of the
union response to migration for the utility of foreign workers are
ambiguous: for a given domestic nonunion wage the higher ρ makes
domestic goods more expensive abroad, while the drop in the domestic
nonunion wage induced by the increase in ρ makes them cheaper. Overall,
however, foreign residents must be hurt by the influx of labor since
both w and ρ rise. The effect of the union response to migration on the
utility of domestic nonunion workers is unambiguously negative since
the rise in ρ diminishes the purchasing power of the domestic nonunion
wage with respect to domestically produced union sector goods, and the
induced fall in w makes goods abroad more expensive as well. The
overall effect of the migration on domestic nonunion residents is
ambiguous, however, because of the rise in both w and ρ. If
domestically produced union goods enter with sufficient weight in
utility, the (remaining) domestic nonunion labor force will be made
worse off by the exodus of domestic labor.

IV. Conclusion 

Competitive trade theory suggests that freely working market forces in
a world economy will determine optimally the international location of
production and that a country engaged in trade need have no additional
concerns over the identity of sectors operating within its borders.
While recent literature has questioned this result and points toward
the potential national benefits of the domestic location of sectors
with certain attributes, the focus has been primarily on product
markets.¹¹ Moreover, there is nothing inherently perverse about the
free-trade allocation of production across countries in this
literature: there simply exist certain sectors that all countries would
rather have operating within their borders.

This paper has focused on the existence of an optimizing labor union
and has explored the interaction between union wage demands and the
international pattern of production and trade. In the model considered,
the scope of domestic production takes on a special welfare
significance of its own since it determines in a systematic way the
characteristics of the set of firms served by the trade union. A
broader scope of production at home is associated with a lower domestic
union wage premium, while an “intensification” of foreign competition
leads to a higher domestic union premium.

As illustrated in Section III, this relationship implies several
conclusions concerning the effects of changes in the international
environment that lead to shifts in the international pattern of
production. First, a shift in preferences away from union sector goods
and into the nonunion goods of the domestic country, in driving up the
domestic nonunion wage relative to the wage abroad, will diminish the
scope of domestic union sector production and hence reduce the sectoral
labor intensity of production: this leads to a rise in the
rent-maximizing domestic union wage premium. As such, the model
predicts a rising union premium in the face of declining demand for
union products. Second, directing domestic R & D efforts toward the
union sector will expand the scope of union production and reduce the
union premium, while R & D in the nonunion sector leaves unchanged the
scope of union sector production and, hence, union behavior: this
raises the possibility of welfare gains from a program of union sector
targeting. Finally, international labor migration can have very
different effects when a labor union is present. In particular, the
presence of a union in the technologically more advanced country can
cause migration of nonunion labor to occur in the direction of the less
advanced country. Moreover, as this migration takes place, the union
wage premium is driven up, offsetting the welfare gains of migration to
the remaining nonunion population in the domestic country.

The inverse relationship between scope of production and the union
premium that emerges from this model depends critically on the notion
that international differences in union activity are an important
determinant of international cost differences. An extreme view has been
adopted here, in that the presence or absence of a union is all that
distinguishes countries in the model. The interaction of labor union
activity with other determinants of trade patterns is clearly a
direction for further research.

11 See Krugman (1986) for a recent review.

We analyze equilibrium in a labor market model wherein it takes time
for the workers to contact firms. Workers, assumed identical,
repeatedly sell their labor services all through their work lives,
choosing their search intensity endogenously. Identical firms attempt
to maximize their steady-state profit flow. We focus on the importance
and consequences of balanced matching, in which workers are more likely
to contact a larger firm. A unique equilibrium is shown to exist
wherein all firms offer the same wage and select an employment level at
which wage equals marginal product. The effect of traditional labor
market policies and empirical implications are discussed.


I. Introduction 

The purpose of this paper is to construct and analyze a market model
wherein it takes time for the agents to contact each other. We consider
a labor market model of this type with homogeneous workers and
identical firms and specify conditions that guarantee the existence of
a unique equilibrium in which, among other things, there is a strictly
positive level of unemployment. As this equilibrium is relatively
simple to characterize, the equilibrium consequences of changes in the
market parameters can be investigated. This leads to new and
empirically relevant predictions when traditional labor market
policies are considered.

A brief survey of modern labor economics reveals that many studies,
perhaps the majority, ignore the market approach and concentrate
only on the supply side of the subject. A possible reason for this
unsatisfactory state of affairs is that equilibrium in the standard
competitive model does not have those characteristics that, rightly or
wrongly, have become the focus of attention in labor economics. For
example, to generate unemployment in the competitive framework
some rigidity is typically assumed to hold (such as downward rigid
wages). As the reasons given for the existence of such rigidities are
usually not convincing, this approach is somewhat unsatisfactory.

A different tack will be taken here. Unlike the model of perfect
competition, it will be recognized at the outset that it takes time for
workers to contact firms. Further, firms will be assumed to be
cognizant of such difficulties when setting their wages.

An element of the model to be presented is based on the theory of
job search¹ and the related literature on consumer search. Although
the vast majority of studies in this area consider only one side of the
market, there are now several studies that analyze market equilibrium
within a search framework (see, e.g., Axell 1977; Reinganum 1979;
Burdett and Judd 1983).² The equilibrium search models (in
homogeneous agent contexts) usually consider a consumer good market in
which each consumer wishes to minimize the expected cost of
purchasing a unit of good. It has been shown that if (a) all firms face
the same constant marginal cost of production and (b) the cost per
search to each consumer is positive and bounded away from zero, a unique
market equilibrium exists in which each firm offers the monopoly
price. (Diamond [1971] first obtained this result.) Within the context
of a labor market, conditions similar to those specified above generate
a unique equilibrium with all firms offering the monopsony wage.
There are two main features of the model considered here that render
the equilibrium different from the monopsony wage equilibrium
above.

First, unlike the equilibrium search models (previously studied), the
model used here is not a “one-shot” affair in which workers sell their
labor services only once. Workers will be assumed to sell their labor
services repeatedly (i.e., constantly engaged in search) throughout the
duration of their participation in the market. They will be taken to
have an uncertain (random) duration in the market, and any worker
who leaves the market is instantly replaced by another who is initially
unemployed. Unemployed workers, subject to an increasing cost of
search, can choose the rate (or intensity) at which firms (vacancies) are
contacted. When a contact is made, a worker can either accept the
offer or reject it. If an offer of a firm is accepted, the worker (while
employed at that firm) can further continue to search for other firms
offering higher wages.

¹ For a survey, see Lippman and McCall (1976) and Mortensen (1986b). See also
Diamond (1984).

² Other studies of dispersed price equilibrium include Butters (1977), Salop and
Stiglitz (1977), Wilde and Schwartz (1979), MacMinn (1980), Chan and Leland (1982),
Schwartz and Wilde (1982), Gal-Or (1984), and Rob (1985).

The second main feature concerns the matching or the meeting
technology. Typically, in job search models it is assumed that a worker
is equally likely to contact any firm in the market. Thus, when there
are n firms in the market (1 < n < ∞), the probability that a worker
contacts a given firm is 1/n. This will be termed “random matching.”³
This simple specification, routinely used with little justification, is
subject to criticism. For example, an unfortunate consequence of random
matching is that a firm, by splitting itself into two, can increase its
number of potential employees since 1/n is less than 2/(n + 1) and,
thus, possibly increase its profits. In the present study, it will be
assumed that a worker is more likely to contact larger firms. In
particular, the probability that a worker contacts a firm (given that a
contact is made) equals the number employed by that firm divided by the
total employed labor force. Balanced matching will be said to hold when
this restriction is imposed. With balanced matching it is easier for a
larger firm to contact workers than with random matching. This fact
will be shown to have important implications.
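
The splitting argument can be checked with a few lines of exact arithmetic. The sketch below is only an illustration: the function names and the numbers (10 firms, a firm employing 30 of 100 employed workers) are assumptions made here, not values from the paper.

```python
from fractions import Fraction

def random_matching_share(n_firms, units=1):
    """Contact probability of an owner holding `units` of the n_firms
    firms when every firm is equally likely to be contacted."""
    return Fraction(units, n_firms)

def balanced_matching_share(employment, total_employment):
    """Contact probability under balanced matching: the firm's
    employment divided by total employment."""
    return Fraction(employment) / Fraction(total_employment)

n = 10  # hypothetical number of firms
before_split = random_matching_share(n)               # 1/n
after_split = random_matching_share(n + 1, units=2)   # 2/(n + 1)

# Hypothetical firm employing 30 of 100 employed workers, split in half.
whole = balanced_matching_share(30, 100)
halves = balanced_matching_share(15, 100) + balanced_matching_share(15, 100)

print(before_split, after_split)  # splitting raises the share under random matching
print(whole, halves)              # splitting leaves the share unchanged under balanced matching
```

Under random matching the combined contact probability rises after the split, whereas under balanced matching the two halves together attract exactly what the undivided firm did.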

The behavior of workers is described in Section II, with firm behavior
taken as given. Search intensity functions are derived that
characterize the rate at which a worker contacts firms given his current
status and the distribution of wage offers. These search intensity
functions are then used in Section III, where firm behavior is
analyzed to ascertain the flow of workers among firms and from
unemployment to employment. When balanced matching holds, it will be
shown that a firm, maximizing its expected steady-state profit flow,
chooses a wage and an employment level such that the flow of
potential new employees equals the flow of workers out of the firm.

The results derived in Sections II and III form a basis to establish
the unique market equilibrium in Section IV. At this equilibrium all
firms offer the same wage and choose an employment level such that
the wage equals the real value of the marginal product of labor. It will
be shown that this equilibrium possesses a strictly positive level of
unemployment and a wage greater than the (implicit) competitive
wage. In Section V, we analyze the effects of traditional policies (such
as unemployment benefits or search subsidy) on the equilibrium wage
and the unemployment level and also outline other empirically
relevant predictions of the model.

³ Matching here refers to the meeting technology or the manner in which workers
are assigned to (or contact) different firms in the market. In Mortensen (1978) and
Jovanovic (1979), matching is the partnership, and its quality, between a specific worker
and a specific firm. Indeed, workers in our study are homogeneous in productivity.

II. Worker Behavior 

Suppose that a large number of homogeneous workers participate in
a labor market. Although the size of the labor force is assumed to be
fixed, there is turnover. In particular, let δh denote the probability
that any given worker leaves the labor force in a small period of time
h. A worker who leaves the market is replaced by another who is
initially unemployed.

At any moment in time a participating worker is either unemployed
or employed at one of the firms in the market. Any worker not
satisfied with his or her status may choose to look for a new one by
contacting firms. Contacting firms, however, is costly and time
consuming. Suppose that a worker’s contact frequency, λ, is the
parameter of a Poisson process. The cost per instant of choosing contact
frequency λ is indicated by c(λ), where c(·) is increasing, strictly
convex, and differentiable and satisfies the following restrictions:

$$c(0) = 0, \tag{1a}$$

$$\lim_{\lambda \to 0} c'(\lambda) = 0, \qquad \lim_{\lambda \to \infty} c'(\lambda) = \infty. \tag{1b}$$

The quantity c(λ) is a given feature of the market and is interpreted as
the cost of investment in search necessary to generate a contact
frequency λ, regardless of the worker's status.⁴ The contact frequency
chosen by a particular worker will, of course, depend on his or her
current status.

Each worker is aware of facing two functions. The first of these
functions, F(·), is a distribution function describing the wages offered
by firms in the market; that is, F(w) denotes the fraction of firms
offering a wage no greater than w. Assume that F(·) has compact
support on the positive real line. The other function known to a
worker, α(·, F), characterizes what will be termed the “matching
technology” given the distribution of wages, F. In particular, α(w, F)
indicates the probability that any worker is matched with a firm
offering a wage greater than w, given that a contact is made.⁵ Let
H(w, F) = [1 − α(w, F)] for all w. The distribution function H(·, F) is
assumed to have the same support as F(·).

⁴ A more general form of the search cost function may be used allowing for its
dependence on the worker’s status, i.e., the current wage. As pointed out later, the
main results derived will hold if the cost is increasing in the current wage (i.e., in
informal terms, the higher the wage received, the lesser is the effort that can be devoted
to search). For simplicity, we employ the form (1).

When α(w, F) = 1 − F(w) for all w, then random matching will be
said to hold. A detailed specification of the matching technology to be
studied will be given in the next section. For the present, we focus on
the search behavior of workers who maximize expected lifetime
income, for any given matching technology.
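
To see concretely how H(·, F) can depart from F(·) once contacts are weighted by firm size, consider a hypothetical two-wage market in which half the firms offer a low wage and half a high wage, while one-third of the employed work at the low wage. The sketch below is an illustration only; the wage levels 1 and 2 and the function names are assumptions.

```python
from fractions import Fraction

def offer_cdf(firm_shares, w):
    """F(w): fraction of firms offering a wage no greater than w."""
    return sum(share for wage, share in firm_shares if wage <= w)

def balanced_H(employment_shares, w):
    """H(w, F) = 1 - alpha(w, F) when contacts are proportional to
    employment: share of the employed at firms paying no more than w."""
    return sum(share for wage, share in employment_shares if wage <= w)

w1, w2 = 1, 2  # hypothetical wage levels, w2 > w1
firm_shares = [(w1, Fraction(1, 2)), (w2, Fraction(1, 2))]
employment_shares = [(w1, Fraction(1, 3)), (w2, Fraction(2, 3))]

for w in (Fraction(1, 2), w1, w2):
    print(w, offer_cdf(firm_shares, w), balanced_H(employment_shares, w))
```

Between the two wages F equals 1/2 while H equals 1/3, so random matching (H = F) fails in this example even though both functions share the same support.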

Assume that a worker receives w_0 per instant when unemployed.
Let V(w, F, α) denote the maximum expected discounted lifetime
income of a worker who currently faces wage rate w (w = w_0 if
unemployed). Given the model specified above, dynamic programming
techniques can be used to establish the optimal contact frequency
choice of any worker. Letting ρ denote the instantaneous rate of
discount and h a small interval of time, we have

$$V(w, F, \alpha) = \max_{\lambda \ge 0} \frac{1 - \delta h}{1 + \rho h} \Big\{ wh - c(\lambda)h + \lambda h \int_w^{\bar{w}} V(e, F, \alpha)\, dH(e, F) + V(w, F, \alpha)\big[\lambda h H(w, F) + (1 - \lambda h)\big] \Big\} + o(h^2), \tag{2}$$

where o(h²) is the usual term indicating the probability of receiving
more than one offer in time period h, multiplied by the conditional
expected return. In the equation above, we have defined w̄ = sup W,
where W is the support of the distribution F. With a short time
interval of length h, the first term on the right side of (2) indicates the
income (present value evaluated at the beginning of this interval) net
of search costs accumulated during the interval, given that the worker
stays in the labor force. The second term is the expected future
income from a better offer if one were to be contacted during this short
interval.⁶ The third term arises because the status of the worker is not
altered if, during this interval, either no offer arises or an offer
contacted is no better than the current status.⁷ Manipulating (2) and then
letting h → 0 yields

$$V(w, F, \alpha) = \max_{\lambda \ge 0} \frac{1}{\delta + \rho} \Big\{ w - c(\lambda) + \lambda \int_w^{\bar{w}} \big[V(e, F, \alpha) - V(w, F, \alpha)\big]\, dH(e, F) \Big\}. \tag{3}$$

It is well known that compactness of the support of H(·, F) guarantees
that (3) is well specified (see, e.g., DeGroot [1970] for details). It is
easy to show that the value function above is increasing in w (see n. 6).
An obvious consequence of (1b) and (3) is that a worker will choose a
unique contact frequency λ = λ(w, F, α) such that

$$c'(\lambda(w, F, \alpha)) = \int_w^{\bar{w}} \big[V(e, F, \alpha) - V(w, F, \alpha)\big]\, dH(e, F); \tag{4}$$

that is, a worker chooses a contact frequency that equates the
marginal cost of increasing the frequency of contacting firms to the
expected return. Note that the Inada-type restrictions placed on c(·)
guarantee a strictly positive choice of λ if H(w, F) < 1. It is now
possible to establish some important properties of the optimal search
intensity.

Proposition 1. λ(w, F, α) is continuous in w such that (a) λ(w, F, α) is
strictly decreasing in w whenever H(w, F) < 1, (b) λ(w, F, α) = 0 if
H(w, F) = 1, and (c) if H(w, F) = 0 whenever w < w̄ and H(w, F) = 1
whenever w ≥ w̄ (w̄ > w_0), then there exists a function γ(·) such that
γ(w̄ − w_0) = λ(w_0, F, α) for all w̄ > w_0.

Proof. It is easily noted from (3) that

$$\frac{\partial V(w, F, \alpha)}{\partial w} = \frac{1}{\rho + \delta + \lambda(w, F, \alpha)[1 - H(w, F)]}.$$

Using this and totally differentiating (4), we can show that

$$\frac{\partial \lambda(w, F, \alpha)}{\partial w} = - \frac{1 - H(w, F)}{c''(\lambda(w, F, \alpha))\{\rho + \delta + \lambda(w, F, \alpha)[1 - H(w, F)]\}}.$$

Claims a and b are immediate consequences of this relation.

⁵ In general, the functions F and α are different. For example, suppose that half the
firms offer a wage w_1 and the other half offer w_2 (> w_1). Further, suppose that
one-third of the employed receive w_1 and the rest receive w_2. With balanced matching
technology, 1 − α(w, F) is 0 for w < w_1, is 1/3 for w_1 ≤ w < w_2, and is 1 for w ≥ w_2.
This is different from F. The two functions will be different unless the events “contact”
and “offer greater than w” are independent.

⁶ In writing the dynamic programming equation in the form given in (2), we have
made use of the fact that V(w, α, F) is increasing in w for each pair (α, F) because the
matching probability function α and the wage distribution F faced by a worker do not
depend on the current wage received by the worker. This intuitive property of the
value function can be shown formally, through routine techniques. Consider the
dynamic programming equation more detailed than (2):

$$V(w, F, \alpha) = \max_{\lambda \ge 0} \frac{1 - \delta h}{1 + \rho h} \Big\{ wh - c(\lambda)h + \lambda h \int_0^{\infty} \max\big[V(e, F, \alpha), V(w, F, \alpha)\big]\, dH(e, F) + (1 - \lambda h) V(w, F, \alpha) \Big\} + o(h^2).$$

It can be easily shown, through techniques of Blackwell (1965), that V(w, α, F), the value
function (the fixed point of the contraction operator defined by the equation above), is
continuous and increasing in w. This fact, in turn, leads to (2). If current wage
influenced either α or F (e.g., if some jobs blocked all chances of getting a better job), then
such a monotonic property might be lost. This case is not considered here. All jobs are
identical except for the wage.

⁷ It is assumed without loss of generality that the value of leaving the labor force is
zero.

The hypotheses of claim c imply that every job offer contacted has
an associated wage w̄. For this case, equation (3) becomes

$$V(w_0, F, \alpha) = \frac{1}{\rho + \delta} \max_{\lambda \ge 0} \big\{ w_0 - c(\lambda) + \lambda \big[V(\bar{w}, F, \alpha) - V(w_0, F, \alpha)\big] \big\}$$

and

$$V(\bar{w}, F, \alpha) = \frac{\bar{w}}{\rho + \delta}.$$

Letting λ_0 denote the optimal search intensity for this case, we get,
after manipulating these equations,

$$V(\bar{w}, F, \alpha) - V(w_0, F, \alpha) = \frac{1}{\rho + \delta + \lambda_0} \big[(\bar{w} - w_0) + c(\lambda_0)\big].$$

Since this difference in the values of employment and unemployment
states equals the marginal search cost c′(λ_0) at the optimum, it follows
that λ_0 = λ(w_0, F, α) depends only on the wage difference w̄ − w_0, the
parameters ρ and δ, and the cost function c(·). This proves claim c of
the proposition. Q.E.D.
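
Claim c can be illustrated numerically. The sketch below solves the first-order condition for the search intensity of an unemployed worker when every offer pays a single wage, working from the two value equations used in the proof rather than from the wage gap directly. The cubic cost function, the parameter values, and the bisection bracket are assumptions chosen for illustration; they are not taken from the paper.

```python
def unemployed_search_intensity(w_bar, w0, rho, delta, cost, marginal_cost, hi=100.0):
    """Solve c'(lam) = V(w_bar) - V(w0) by bisection, where
    V(w_bar) = w_bar / (rho + delta) and V(w0) solves
    (rho + delta) * V0 = w0 - c(lam) + lam * (V(w_bar) - V0).
    `hi` is assumed to lie above the solution."""
    r = rho + delta
    v_bar = w_bar / r

    def foc_gap(lam):
        v0 = (w0 - cost(lam) + lam * v_bar) / (r + lam)
        return marginal_cost(lam) - (v_bar - v0)

    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc_gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical cost c(lam) = lam**3, which satisfies restrictions (1a) and (1b).
cubic = lambda lam: lam ** 3
cubic_prime = lambda lam: 3.0 * lam ** 2

lam_a = unemployed_search_intensity(3.0, 2.0, 0.05, 0.10, cubic, cubic_prime)
lam_b = unemployed_search_intensity(5.0, 4.0, 0.05, 0.10, cubic, cubic_prime)
lam_c = unemployed_search_intensity(4.0, 2.0, 0.05, 0.10, cubic, cubic_prime)
print(lam_a, lam_b, lam_c)
```

The first two calls share the same wage gap but different wage levels and return the same intensity, as claim c asserts; the third call has a larger gap and returns a higher intensity.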

Proposition 1 establishes that the labor market history of a worker
can be described as follows. Each worker enters the market as an
unemployed worker. A contact frequency is then chosen on the basis
of the wage distribution and the unemployment income. Any offer
received that yields a greater expected return than unemployment
will be accepted. A strictly positive contact frequency will then be
chosen as long as the worker is in the market and is not currently
receiving the maximum wage. When employed, the higher the current
wage received, the lower is the chosen intensity of search.⁸

III. Firm Behavior

The framework used and the behavior of workers will be as described
in the previous section. Assume that the firm under consideration
chooses its own wage and employment level on the basis of its
expectations about (a) the behavior of workers and (b) the wages chosen by
other firms. Given the firm’s wage choice and the wages selected by
other firms in the market, each worker will pick a desired contact
frequency on the basis of the current status. This in turn specifies the
flow of workers into and out of any particular firm, given the stock of
unemployed and the initial number of workers employed in various
firms. For the rest of the study it will be assumed that there are an
infinite number of workers who are uniformly distributed on [0, 1].
This harmless normalization implies that the phrases “the number of
workers” and “the fraction of workers” can be used interchangeably.

⁸ It can be seen (from the arguments used to establish the proposition) that
proposition 1 holds even when the search cost is allowed to depend on the worker's status
(refer to n. 4) if the cost increases with the current wage.

Let (w, m) denote the wage rate and employment choice by the firm
under consideration. Of course, a particular (w, m) may not be feasible
in that the inflow of potential employees is less than the outflow of
workers. The net rate of flow of workers into the firm (which is inflow
minus outflow) is denoted by θ(w, m, m_0, α, G), where G(w) is the
fraction of workers employed at a wage rate less than w, and m_0 is the
fraction unemployed.⁹ It follows that

$$\theta(w, m, m_0, \alpha, G) = m_0 \beta \lambda(w_0, F, \alpha) + \beta \int_0^w \lambda(e, F, \alpha)\, dG(e) - \lambda(w, F, \alpha)\, m\, \alpha(w, F) - \delta m, \tag{5}$$

where β indicates the probability that any worker contacts the firm
under consideration (given that the worker has made a contact). The
matching technology α determines the value of β.

The first term on the right-hand side of (5) indicates the flow of
unemployed workers who contact the firm, whereas the second term
denotes the flow of employees from other firms who are willing to
become employees at the firm. The third and fourth terms indicate
the flow of workers out of the firm (to other firms and to
nonparticipation, respectively).

If θ(w, m, m_0, α, G) > 0, the flow of potential employees to the firm is
strictly greater than the flow of workers out of the firm. In this case,
the firm will not wish to employ all those who are willing to be
employed. If θ(w, m, m_0, α, G) < 0, the flow of workers out of the firm is
strictly greater than the flow of potential employees. In this situation
the employment level choice of the firm is not sustainable.

To establish results in this section it will be assumed that balanced
matching holds. In particular, we let

$$\beta = \frac{m}{1 - m_0}, \tag{6}$$

where 1 − m_0 > 0 denotes the total number of employed workers in
the market. This restriction obviously places structure on the
matching technology function, α(·, F), defined in the previous section, and
thus on the contact frequency functions described in proposition 1.
The important consequence of restriction (6) can be quickly seen as
(5) becomes

$$\theta(w, m, m_0, \alpha, G) = m \Big\{ \lambda(w_0, F, \alpha) \Big[\frac{m_0}{1 - m_0}\Big] + \frac{1}{1 - m_0} \int_0^w \lambda(e, F, \alpha)\, dG(e) - \lambda(w, F, \alpha)\alpha(w, F) - \delta \Big\} \equiv m\, \hat{\theta}(w, m_0, \alpha, G). \tag{7}$$

The term in the braces on the right side of (7), defined as θ̂(w, m_0, α,
G), is independent of the employment level choice m. This quantity
may be interpreted as the flow of net hires (supply) per unit labor
demand (given the market parameters). Balanced matching renders
the net inflow of workers linear in the firm’s employment level choice.
This fact has important consequences in the analysis to follow.

⁹ We have written θ as a function of only those arguments that are relevant for the
discussion.
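
The proportionality in (7) is easy to verify numerically. The sketch below evaluates the right-hand side of (5) with the search intensities and the integral term supplied as plain numbers, first with β = m/(1 − m_0) and then with a fixed β, as under random matching. All names and numerical values are assumptions introduced for illustration.

```python
import math

def net_inflow(m, beta, m0, lam_unemployed, inflow_from_employed,
               lam_own, alpha_own, delta):
    """Right-hand side of (5): contacts from the unemployed and from
    lower-paid employed workers, minus quits to better-paying firms
    and exits from the labor force."""
    hires = beta * (m0 * lam_unemployed + inflow_from_employed)
    separations = lam_own * alpha_own * m + delta * m
    return hires - separations

# Hypothetical market parameters.
params = dict(m0=0.2, lam_unemployed=0.5, inflow_from_employed=0.1,
              lam_own=0.3, alpha_own=0.4, delta=0.1)

def balanced(m):
    return net_inflow(m, beta=m / (1.0 - params["m0"]), **params)

def random_matching(m, n_firms=20):
    return net_inflow(m, beta=1.0 / n_firms, **params)

print(balanced(0.01), balanced(0.02))                # second value is twice the first
print(random_matching(0.01), random_matching(0.02))  # not proportional to m
```

Doubling m doubles the net inflow only in the balanced case, which is why the zero-net-inflow wage does not depend on the employment level there.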

Let π(w, m) denote the profit flow of the firm given that choice (w,
m) is sustainable, that is,

$$\pi(w, m) = p f(m) - w m, \tag{8}$$

where p indicates the firm’s output price, and f(·) the production
function faced by the firm. To keep things simple, assume that f is
strictly concave, increasing, and differentiable such that

$$\lim_{m \to 0} f'(m) = \infty, \qquad \lim_{m \to 1} f'(m) = 0. \tag{9}$$

The firm is assumed to choose a pair (w, m) to maximize its sustainable
steady-state profit flow¹⁰ (with m_0, α, and G given exogenously), that
is,

$$\max_{(w, m)} \pi(w, m) \quad \text{subject to} \quad \theta(w, m, m_0, \alpha, G) \ge 0. \tag{10}$$

A characterization of the unique solution to this problem is
presented in the following claim.

Proposition 2. Given the balanced matching specification, and for
any given m_0 (0 ≤ m_0 < 1) and G, there exists a unique solution (ŵ, m̂)
to (10) such that

$$\theta(\hat{w}, \hat{m}, m_0, \alpha, G) = 0 \tag{11}$$

and

$$p f'(\hat{m}) - \hat{w} = 0. \tag{12}$$

Proof. It is straightforward to establish, from proposition 1 and
equation (7), that balanced matching implies that the net inflow θ(w,
m, m_0, α, G) is continuous and strictly increasing in w¹¹ if m > 0. This
implies that, whenever θ > 0, the firm can improve its profits by
lowering the wage so that the desired stock of employed is just
satisfied at zero net inflow of workers.¹² Thus the solution to (10) must
satisfy θ(w, m, m_0, α, G) = m θ̂(w, m_0, α, G) = 0 if m > 0. Let ŵ(m_0, α, G)
denote the unique wage choice, obtained from this condition (see
fig. 1).

The firm’s isoprofit-flow curves are also illustrated in figure 1. The
isoprofit-flow line is defined by the slope dw/dm (at dπ = 0) =
[pf′(m) − w]/m. Inspection of this figure establishes the unique solution
(ŵ, m̂) satisfying (11) and (12) and, thus, the proposition. Q.E.D.

¹⁰ A dynamic optimization problem for the firm can be formulated by taking into
account the dynamics of employment level such as dm/dt = θ(w, m, m_0, α, G). Such a
formulation is analytically more involved and may, in general, have oscillatory
behavior. As a first step in the understanding of the problem, we focus on the simpler
steady-state

With balanced matching, the net inflow of workers is proportional
to the employment level and monotonic in the wage. A firm maximizing
its steady-state sustainable profit flow selects a unique wage
corresponding to zero net inflow of workers, and this wage is independent
of the employment level choice. The employment level is then chosen
to equate the real value of marginal product with the wage.

¹¹ The continuity of θ needs some explanation. Notice that λ(w, ·) is continuous in w
(proposition 1), and α(w, F) is right continuous in w since H(w, F) = 1 − α(w, F) is a
distribution. The left continuity of θ follows from the convention that a worker will not
move to another firm offering the same wage. To see this, consider the third term
on the right side of (5). With ε > 0, if the wage is w − ε, this term contributes −λ(w − ε,
F, α)mα(w − ε, F) to the quantity θ(w − ε, ·). As ε → 0, this term converges to −λ(w,
F, α)mα(w, F) since the workers will, in this limit, not leave for other firms offering w.
Continuity follows from the equality of right and left limits.

¹² Note from (5) that as w → w_0, θ is negative since the inflow is zero.

IV. Steady-State Equilibrium with Balanced 
Matching 

In this section we shall analyze the existence and properties of a
steady-state equilibrium, under balanced matching, with an infinite
number of homogeneous workers (the search behavior of each
worker as described in Sec. II) and an infinite number of identical
firms (the profit-maximizing behavior of each as in Sec. III)
uniformly distributed on the interval [0, 1]. To suit the setup of a
continuum of firms, we shall define balanced matching to hold if the
probability α(w, F) that a worker is matched (conditional on receiving
an offer) with any one of the firms offering wages greater than w
equals the ratio of the total number employed at these firms (at wages
in excess of w) to the total number employed in the market. In other
words, with G(w) representing the fraction employed at wage w or less
and m_0 (0 ≤ m_0 < 1) denoting the fraction unemployed, balanced
matching holds if

$$1 - \alpha(w, F) = \frac{G(w)}{1 - m_0}. \tag{13}$$

An equilibrium with balanced matching is said to exist if there is a
distribution of wage offers F* and an unemployment level m_0* such
that the following conditions A-C are satisfied. Given m_0*, the
balanced matching specification α(w, F*), and G, the individual behavior
of all firms taken together generates a distribution of wage choices
and a corresponding labor demand function. (Let m(w) denote the
total labor demand of all firms with wage choice w.) The first
equilibrium condition is that the distribution of the wage choices so
generated is F*.¹³ In other words, (A) the distribution, F*, of wage offers of
firms is such that there is zero net inflow of workers to any firm; (B)
G(w) = ∫_0^w m(e)dF*(e) for all w in the support of F*; and (C) the total
labor demand m* = ∫_0^∞ m(w)dF*(w) is such that 1 − m* = m_0*.

Thus at an equilibrium each firm has zero flow of net hires and
workers are paid the real value of their marginal product. In
addition, the labor demand for any set of firms is just satisfied, and finally,
the number unemployed equals the number of labor market
participants minus the number employed. A consequence of this definition
of equilibrium can now be stated.

¹³ The distribution F* describes a Nash equilibrium wage choice of firms.

Proposition 3. If an equilibrium exists with balanced matching, all
firms will offer the same wage and thus demand the same number of
employees.

Proof. Suppose that an equilibrium (F*, m_0*) exists. Considering the
set of firms offering wages in excess of w and equating to zero the net
inflow of workers to this set, we have

$$m_0^*\, \alpha(w, F^*)\lambda(w_0, \alpha, F^*) + \alpha(w, F^*) \int_0^w \lambda(e, \alpha, F^*)\, m(e)\, dF^*(e) - \delta \int_w^{\infty} m(e)\, dF^*(e) = 0. \tag{14}$$

Observing from (13) and condition B that ∫_w^∞ m(e)dF*(e) = (1 −
m_0*)α(w, F*) and recognizing that, at an equilibrium, the flow of
employed workers out of the market, equaling δ(1 − m_0*), must be the
same as the flow of unemployed workers obtaining jobs, equaling
m_0*λ(w_0, α, F*), it follows from (14) that

$$\int_0^w \lambda(e, \alpha, F^*)\, m(e)\, dF^*(e) = 0 \tag{15}$$

for all w in the support of F*. However, it follows from proposition 1
that (15) can be satisfied only if λ(w, α, F*) = 0 for all relevant w. In
other words, all firms must offer the same wage. Q.E.D.

The fact that the equilibrium wage distribution is degenerate may
be understood as follows. Since the problem of an individual firm
yields a unique wage choice and because all firms are identical (and
nonatomic), facing the same net hires function (θ̂), all choose the same
wage in equilibrium. For the wage dispersion to be a possibility, there
must be a collection of (w, m) pairs, in the individual firm’s problem,
such that all pairs yield the same profit. In other words (referring to
fig. 1), the locus of (w, m) pairs that result in zero net hires (for all F)
must intersect the constant profit curve (corresponding to the
maximum) at more than one point. This is not the case with balanced
matching.¹⁴

In what follows, we proceed with the analysis and establish that 
there exists a unique equilibrium. Since each firm can, and does, 

14 This view of the proposition (pointed out by the referee) suggests that the specified 
balanced matching is not an isolated instance that obtains a degenerate equilibrium 
wage distribution. For example, in (6), if £ * km* {k being a normalizing constant), then 
# is linear in m. It can be shown that the locus of (», m) pairs yielding zero net hiring is 
convex, and consequently there is a unique choice of (w, m) for the firm’s problem in

Fig. 2

select the wage so as to have zero net hires in steady state (given m_0, α,
and F) and because ∂θ̂/∂m_0 > 0 (which is obvious from inspection of
[7]), it follows that the wage chosen is a decreasing function of m_0 (see
fig. 2). Thus, by considering different levels of unemployment m_0, we
can construct for each firm a locus of the possible pairs of wage and
employment level desired. Since we need to consider only the case of
all firms offering the same wage, it follows that the locus of the pairs
of wage and total employment level (desired by all firms) can be
obtained. Let m(w) represent the total employment level desired,
when all firms offer wage w. To sustain this level in steady state, the
total demand (flow) of new hires equals δm(w), which is shown in
figure 3. Note that the Inada-type conditions imposed by (9)
guarantee that m′(w) < 0 and δm(w) → δ as w → 0. In addition, m⁻¹(m) → ∞ as
m → 0.

Now, given that all firms offer the same wage, only unemployed
workers search. Let m_0γ(w − w_0) denote a family of supply of new
hires, generated by different values of m_0 (see fig. 3). Only one of
these will satisfy the equilibrium condition C, thus leading to the
following proposition.

Fig. 3

Proposition 4. There exists a unique equilibrium with balanced
matching in which the level of unemployment, m_0*, can be written as

$$m_0^* = \frac{\delta}{\gamma(w^* - w_0) + \delta}, \tag{16}$$

where w* is the wage chosen by all firms. Further, the equilibrium
wage chosen is strictly greater than the competitive equilibrium wage.

Proof. It is readily seen from figure 3 that there is a unique wage,
w*, that equates the supply of and demand for new hires. At this wage
m_0* = 1 − m(w*) = 1 − m*, thus satisfying the equilibrium conditions.
Letting the supply of new hires, m_0*γ(w* − w_0), equal the demand, δ(1
− m_0*), establishes the first part of the claim.

To prove the second part, note that under perfect competition,
when c(λ) = 0 for all λ (i.e., when information about wages is costless),
the market clears instantaneously, and there is no unemployment,
that is, m_0* = 0. For this case, the supply of new hires is δ if the market
wage is at least as large as w_0, and zero otherwise. Thus the
competitive wage denoted by w_c in figure 3 is lower than w*. Q.E.D.
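
A small numerical sketch may help fix ideas about how the equilibrium of proposition 4 is pinned down. It rests on two functional forms that are assumptions made here for illustration and are not taken from the proof above: a production function f(m) = 2√m − m, which satisfies the conditions in (9) and gives total labor demand m(w) = [p/(p + w)]², and a quadratic search cost (k/2)λ², for which the search intensity of the unemployed has a closed form. The parameter values are likewise hypothetical.

```python
import math

def gamma(gap, k, rho, delta):
    """Search intensity of the unemployed for wage gap w - w0, under the
    assumed quadratic search cost (k/2) * lam**2."""
    r = rho + delta
    return -r + math.sqrt(r * r + 2.0 * max(gap, 0.0) / k)

def labor_demand(w, p):
    """Total employment m(w) solving p * f'(m) = w for the assumed
    production function f(m) = 2*sqrt(m) - m."""
    return (p / (p + w)) ** 2

def equilibrium(w0, k, rho, delta, p):
    """Return (w_star, m0_star) at which the supply of new hires,
    m0 * gamma(w - w0), equals the demand, delta * (1 - m0)."""
    def excess_supply(w):
        m0 = 1.0 - labor_demand(w, p)          # condition C
        return m0 * gamma(w - w0, k, rho, delta) - delta * (1.0 - m0)

    lo, hi = w0, w0 + 1.0
    while excess_supply(hi) < 0.0:             # widen until the root is bracketed
        hi = w0 + 2.0 * (hi - w0)
    for _ in range(200):                       # bisection on an increasing function
        mid = 0.5 * (lo + hi)
        if excess_supply(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    w_star = 0.5 * (lo + hi)
    return w_star, 1.0 - labor_demand(w_star, p)

base = equilibrium(w0=0.2, k=1.0, rho=0.05, delta=0.1, p=1.0)
higher_ui = equilibrium(w0=0.3, k=1.0, rho=0.05, delta=0.1, p=1.0)
cheaper_search = equilibrium(w0=0.2, k=0.5, rho=0.05, delta=0.1, p=1.0)
print(base, higher_ui, cheaper_search)
```

Because excess supply of new hires is strictly increasing in the common wage under these assumptions, bisection finds the single crossing. The resulting unemployment level satisfies (16), and re-solving with a higher w_0 or a lower k gives a quick way to experiment with the comparative statics of the model.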


V. Policy Analysis and Empirical Implications 

As the equilibrium described in the previous section is relatively
simple, it appears possible to inquire if there are predictable
consequences when there is a change in one of the market parameters.
Below, the equilibrium consequences of such changes will be
investigated as well as the implications of traditional labor market policies.

To illustrate policy analysis, we first consider the resulting
equilibrium when there is a change in unemployment insurance (UI)
payments. These payments directly contribute to the income (w_0) while
unemployed. Using the equilibrium conditions C and (16), it is readily
seen that¹⁵

$$\frac{dw^*}{dw_0} = \frac{[1 - m(w^*)]\gamma'(w^* - w_0)}{[1 - m(w^*)]\gamma'(w^* - w_0) - m'(w^*)[\gamma(w^* - w_0) + \delta]}, \tag{17a}$$

which establishes that 0 < dw*/dw_0 < 1 since m′(·) < 0. Furthermore,

$$\frac{dm_0^*}{dw_0} = \frac{-\delta \gamma'(w^* - w_0)[(dw^*/dw_0) - 1]}{[\gamma(w^* - w_0) + \delta]^2} > 0. \tag{17b}$$

¹⁵ These two conditions yield [1 − m(w*)]γ(w* − w_0) = δm(w*). Taking derivatives
on both sides and rearranging, we obtain (17).

Figure 4 illustrates the consequences of a discrete increase in the UI
payments from w_0 to w_0′. Before this (unanticipated) increase, the
equilibrium wage and unemployment level are shown as w* and m_0*,
respectively. With the number of unemployed held constant at m_0*, the
increase in UI payments shifts the supply function of new hires to
m_0*γ(w − w_0′). At the old equilibrium wage, w*, the demand for new
hires exceeds the supply. Thus increases in both wage and
unemployment level are required to establish the new equilibrium at w*′ and
m_0*′. (This relationship is formally indicated by [17].)

Consider now the consequence of a marginal search subsidy. To 
keep the illustration simple and to allow for parameterization of this 
change, we assume a quadratic search cost 

$$c(\lambda) = \tfrac{k}{2}\,\lambda^2, \qquad k > 0. \tag{18}$$

This form satisfies the restrictions imposed by (1) and hence can be 
used as a special case for all the results derived so far. Furthermore, 
the marginal search subsidy can be readily identified with a decrease 
in k. Given that all firms offer the same wage w, let the optimal 
search intensity of unemployed workers in this case be denoted by λ° 
= γ(w - w_0; k). Using (18), (3), and (4), we can establish that 16 dλ°/dk 
= ∂γ(w - w_0; k)/∂k < 0. 
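The comparative statics for the quadratic case can be checked numerically. The sketch below assumes the first-order condition takes the form kλ(ρ + δ + λ) = w - w_0 + kλ²/2, which is how the accompanying note on the quadratic cost appears to read; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def search_intensity(w, w0, k, rho, delta):
    """Positive root of (k/2)*lam**2 + k*(rho + delta)*lam - (w - w0) = 0.

    Assumed first-order condition (not confirmed by the source):
        k*lam*(rho + delta + lam) = (w - w0) + k*lam**2 / 2
    """
    a = rho + delta
    return math.sqrt(a * a + 2.0 * (w - w0) / k) - a


def foc_residual(lam, w, w0, k, rho, delta):
    """Left side minus right side of the assumed first-order condition."""
    return k * lam * (rho + delta + lam) - ((w - w0) + 0.5 * k * lam ** 2)


# Illustrative (hypothetical) parameter values.
base = dict(w=1.0, w0=0.4, k=2.0, rho=0.05, delta=0.10)
lam_base = search_intensity(**base)

# A search subsidy is modeled as a lower k; also try lower delta and lower rho.
lam_low_k = search_intensity(**{**base, "k": 1.5})
lam_low_delta = search_intensity(**{**base, "delta": 0.05})
lam_low_rho = search_intensity(**{**base, "rho": 0.02})

print("baseline intensity:", lam_base)
print("lower k:", lam_low_k, "lower delta:", lam_low_delta, "lower rho:", lam_low_rho)
```

Under these assumptions, lowering k, δ, or ρ each raises the chosen search intensity, which matches the signs dλ°/dk < 0, dλ°/dδ < 0, and dλ°/dρ < 0 reported in the text.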

Suppose that, prior to the payment of a marginal search subsidy, 
the equilibrium is the pair (w*, m_0*) as shown in figure 5. Payment of a 
marginal search subsidy, represented by a decrease in k to k', twists 
the supply function of new hires to the right, when m_0 is the number 

16 Following the derivation employed in the proof of proposition 1, pt. c, it is seen 
that kλ° = [w - w_0 + (kλ°²/2)]/(ρ + δ + λ°). Rearranging, we get 

$$\lambda^{\circ} = \tfrac{1}{2}\Big\{\sqrt{4(\rho + \delta)^2 + 8(w - w_0)/k} - 2(\rho + \delta)\Big\}.$$
unemployed. At the old equilibrium wage w*, the new supply exceeds 
the demand for new hires. The new equilibrium will be established at 
a lower wage w*' and a lower level of unemployment m_0*', as shown in 
figure 5. Furthermore, a decrease in the turnover rate δ or a decrease 
in the discount rate ρ will have the same effect on the equilibrium as 
that produced by a marginal search subsidy payment. 17 

Finally, we briefly indicate the consequence of a shift in the demand 
for new hires. The usual exogenous changes that shift the demand 
curve to the right are an increase in the output price faced by firms 
and an increase in the price of another input (unspecified in the 
model here) that is a substitute for labor. A wage rate subsidy paid to 
employers will also shift the demand curve out. Following arguments 
similar to those presented for the earlier cases, it is easily inferred that 
an outward shift in the demand for new hires will lead to an increase in 
the equilibrium wage and a reduction in the equilibrium unemployment 
level. 


VI. Conclusion 

This paper studies equilibrium in a labor market wherein it takes time 
for the workers to contact firms. The market model considered here, 
featuring identical workers and homogeneous firms, differs from 
models studied earlier in many ways. First, workers search through¬ 
out their work lives. Second, workers choose their search intensity in a 
costly search environment. Third, there is a particular emphasis on 
the manner in which workers contact different firms, that is, the 
matching technology. Balanced matching, in which workers are more 

17 It is easily seen from the expression in n. 16 that both dλ°/dδ and dλ°/dρ are 
negative. 
likely to contact a larger firm, is studied in contrast to random matching 
used in earlier models. Firms individually choose their wage and 
employment level to maximize the expected flow of profits. A unique 
equilibrium is shown to exist, wherein all firms offer the same wage 
(at which flow of net hires is zero) and select an employment level 
such that wage equals marginal profit. 

While most studies in the labor search literature, even in analyzing 
policy issues, focus only on the supply side, because of the simple 
nature of the equilibrium derived here, we are able to analyze the 
simultaneous effect of policies on demand and supply. The impact of 
traditional labor market policies on the equilibrium is discussed, along 
with other empirical implications. Finally, while this paper points to 
the importance of the matching technology specification, a complete 
understanding of its role in determining equilibrium outcomes is a 
topic for further investigation. 18 
Mr. Smith and the Preachers: The Economics 
of Religion in the Wealth of Nations 


Gary M. Anderson 
The extension of economic analysis to problems beyond the domain 
of formal markets and explicit prices represents a major recent intel¬ 
lectual development. But "economic imperialism" is not new and was 
not invented in Chicago. Adam Smith, in his Wealth of Nations, ex¬ 
tended economic reasoning to a variety of nonmarket exchange 
problems. One example is his analysis of religious behavior. Smith 
viewed participation in religion as a rational device by which individ¬ 
uals enhanced the value of their human capital. He also explained 
the behavior of the clergy and other suppliers of religious services 
from an economic perspective. 


I. Introduction 

The domain of economics has expanded dramatically since Walter 
Bagehot defined it as the "science of business” (see Collini, Winch, 
and Burrow 1983, p. 256). Throughout the nineteenth century and 
afterward, economists only rarely attempted to apply their distinctive 
kinds of reasoning to problems other than those involving exchange 
across well-defined markets in which prices are clearly defined. In 
recent years economists have extended models of rational maximizing 
behavior to problems as diverse as the evolution of the common law 
(Rubin 1977), the organization of the family (Becker 1981), and even 
the internal order of the human body (Ghiselin 1978). This bold 
extension of economic analysis beyond the narrow confines of com- 

I wish to thank David Levy, Fred McChesney, George Stigler, Robert Tollison, Gordon 
Tullock, and an anonymous referee for useful comments. The usual caveat applies. 
mercial exchange has been described by some of its leading proponents 
as “economic imperialism” (see Radnitzky and Bernholz [1987] 
for a collection of recent articles on this theme). 

Curiously, this movement had a famous early proponent who has 
been largely ignored by modern practitioners: Adam Smith. Histo¬ 
rians of economic thought have long noted many of Smith’s impor¬ 
tant applications of economic principles to problems of nonmarket 
exchange and the evolution and function of institutions (see particu¬ 
larly Rosenberg 1960; Stigler 1982). Smith was probably the first “eco¬ 
nomic imperialist.” 

Stigler (1982) has stressed Smith’s inconsistency in relaxing his usu¬ 
ally rigorous assumption of rational self-interest when his analysis 
shifted from private exchange to the political process. Stigler argues 
that, although Smith often explained the behavior of politicians and 
the formulation of political policy from the perspective of the self- 
interest of political actors, he also often failed to do so; Smith was 
more willing to ascribe motivations of individuals in the political 
realm to irrationality than he was in the private realm. 

Stigler correctly notes that there are various examples in the Wealth 
of Nations (henceforth abbreviated as Wealth ) in which Smith attributes 
some forms of human behavior to irrationality or myopia, as well as 
cases in which Smith “preaches” rather than restricting himself to 
offering explanation. But Smith also deserves credit for a bold extension 
of economic analysis into an area of human behavior traditionally 
thought to be beyond the boundaries of economic science: religion. 1 
Even if one accepts that Smith himself had a tendency to preach on 
occasion (at least in his published writings), there is no good excuse 
for the neglect his economic perspective on religious preaching has 
received from historians of economic thought. 

Smith tried to explain why rational self-interested individuals par¬ 
ticipated in religion, on both the demand and supply sides. Combined 
with this approach was a considerable body of analysis of the econom¬ 
ically relevant effects of religious practice and institutions. Particu¬ 
larly interesting was his account of the history of the Catholic church, 
which considered it as a kind of corporate organization. He explored 
the effects of competition as opposed to monopoly in the “market for 
religion,” explained the role of changes in religious institutions on the 
emergence of the commercial society from feudalism, and presented 
an economic theory of the Protestant Reformation. Yet this inter¬ 
esting and important example of Smithian economic imperialism was 
largely ignored by his classical successors and has been seldom noted 

1 See Levy (1978, 1984, 1985), Iannaccone (1987, 1988), and McChesney (1987) for 
recent contributions to the economic analysis of religion. 
in the scholarly literature on Smith and his works. The intent here is 
to help redress this comparative neglect. 

The present paper is divided into five sections. Section II discusses 
the economics of religion, that is, the economic effects of religious 
institutions on social behavior, in Smith’s work. Section III outlines 
the economics of preaching as presented in Wealth. Section IV exam¬ 
ines Smith’s account of the Roman Catholic church as a firm and the 
effects of its legal monopoly franchise in the provision of religious 
services in Western Europe prior to the Reformation. Section V con¬ 
siders Smith’s analysis of the Protestant Reformation from an eco¬ 
nomic perspective. Finally, Section VI concludes the paper with a 
summary of the preceding argument and discusses the significance of 
Smith’s extension of economic reasoning to the study of the role of 
religion in human affairs. 


II. The Economics of Morality and Religious 
Behavior 

In Wealth, Smith was not interested in theological issues or even in the 
nature of religious belief. 2 Rather, he was concerned with two basic 
problems: (1) the economic incentives involved in the individual’s 
decision to practice religion and (2) the economic effects of different 
systems of religious belief as reflected in individual behavior. He did 
not attempt to develop an economic theory of the emergence of reli¬ 
gious beliefs. However, insofar as such beliefs function as constraints 
on the perception and judgments of individuals, they can be expected 
to produce economically relevant effects. Since beliefs are not directly 
observable, the nature and parameters of such constraints must re¬ 
main subject to untestable speculations. Smith attempted the more 
limited task of defining the logical economic consequences of certain 
kinds of religious belief (e.g., the kinds of behavioral effects likely to 
result from an individual’s belief in the existence of the Christian 
God). On the other hand, religious observance (at least to the extent 
that it is publicly observable) can be analyzed from the perspective of 
relevant (and observable) constraints. The costs and benefits of reli¬ 
gious practice, like the costs and benefits of other forms of publicly 
observable behavior, can be at least identified and possibly measured. 
In short, while Smith did not offer a “general theory” of the economic 
function of religion, he did produce major analytical elements for 
one. He attempted to apply the same principles of economics to 
understanding religious institutions that he applied to the under¬ 
standing of ordinary commercial transactions. 

2 This is not intended to imply that Smith was uninterested in strictly theological 
issues (see, e.g., Smith 1982, pp. 163-70). 
Smith noted that one of the most economically significant functions 
of religious belief was to provide strong incentives to follow moral 
strictures that helped to support civil society, that is, honesty, benevo¬ 
lence, restraint from violence, and so forth. In the Theory of Moral 
Sentiments (henceforth abbreviated as Theory), Smith explains that the 
concept of a supreme being serves as an enforcement mechanism for 
moral conduct among believers that, in effect, supplements the en¬ 
forcement efforts of secular authorities and complements the other 
incentives that cause individuals to control their own behavior. In part 
III, section 5, he writes the following: 

When the general rules which determine the merit and 
demerit of actions, come thus to be regarded as the laws of 
an All-powerful Being, who watches over our conduct, and 
who, in a life to come, will reward the observance, and pun¬ 
ish the breach of them; they necessarily acquire a new sa¬ 
credness from this consideration.... The sense of propriety 
too is here well supported by the strongest motives of self- 
interest. The idea that, however we may escape the observa¬ 
tion of man, or be placed above the reach of human punish¬ 
ment, yet we are always acting under the eye, and exposed to 
the punishment of God, the great avenger of injustice, is a 
motive capable of restraining the most headstrong passions, 
with those at least who, by constant reflection, have rendered 
it familiar to them. [1982, p. 170] 

The belief in God constitutes a kind of internal moral enforcement 
mechanism. 3 The cost of external monitoring of every individual’s 
behavior all the time is extremely high. Religion provides the basis for 
a system of internalized monitoring that represents an efficiency- 
enhancing adaptation to this problem. The “terrors of religion” 
helped to enforce moral rules and therefore buttress civil society (p. 
164). 

However, while religious belief functions as a significant element in 
self-monitoring by individuals, Smith also suggests (see pt. VI, esp. 
pp. 237-44) that men erect barriers against their own passions as a 
result of a capacity for moral judgment. Religious belief reinforces this 
self-control. 4 

3 Elsewhere in Theory, Smith points out that religion performs a monitoring function 
that supplements more mundane social constraints on behavior. In pt. III, sec. 2 (pp. 
120-21), he explains that the “all seeing judge of the world” functions to punish vice 
and reward virtue, even in cases in which either goes unnoted by others. 

4 Such self-control was regarded by Smith as a vital requirement for maintaining 
social order. But this did not represent a replacement of economic analysis with the 
preaching of virtue. In the same pt. III of Theory he explains that an individual’s 
inability to control his disposition to “anger, hatred, envy, malice, [and] revenge” was 
likely to render him “the object of hatred, and sometimes even of horror, to other 
Smith was also interested in explaining the economic incentives for 
individuals to choose to participate in religious activities. He offered 
an explanation for such behavior based on his theory of the capital 
value of reputation. 

Smith’s theory of human capital is well known. He explained this 
theory, including detailed examples, in Wealth (bk. I, chap. 10, pp. 
116-35). However, this famous discussion involves the human capital 
value of education and does not discuss the human capital value of an 
individual’s reputation. But a number of passages in Wealth and else¬ 
where clearly indicate that Smith understood the capital value of rep¬ 
utation. 

There are several examples in Wealth of analysis based on a human 
capital theory of reputation. 5 In book V, chapter 1, article 2 (p. 760) 
he explains that teachers will apply diligent effort in their occupation 
in order to maintain the quality of their reputations when their 
salaries are a function of their job performance. Again in article 3 of 
that chapter, in a discussion of the remuneration of parochial clergy 
and “those teachers whose reward depends partly... upon the fees or 
honoraries which they get from their pupils” he argues that payment 
“must always depend more or less upon their industry and reputa¬ 
tion” (p. 790). He argues that a “man of rank and fortune,” if he 
expects to remain a distinguished citizen, must strictly observe “that 
species of morals . . . which the general consent of this society pre¬ 
scribes” (p. 795). 6 


people” (p. 243). Self-control was ordinarily a necessary strategy to maximize the value 
of an individual’s reputational human capital (see below). But such a command of the 
passions was not a sufficient condition for wealth or success (pp. 238-39). 

5 In the set of lectures delivered at Glasgow in 1766, as reported in Lectures on 
Jurisprudence, Smith describes circumstances under which damage to an individual’s 
reputation constitutes an injury, explaining that a man’s reputation will be injured 
"either by falsely representing him a thief or robber, or by depreciating his real worth, 
and endeavouring to degrade him below the level of his profession. A physician’s charac¬ 
ter is injured when we endeavour to persuade the world he kills his patients instead of curing them, 
for by such a report he loses his business" (1978, p. 399; emphasis added). In fact, in a 
famous letter to William Cullen (dated September 20, 1774), Smith argued that restric¬ 
tions on the sale of degrees in medicine by Scottish universities were unnecessary 
because the demand for physicians’ services (and consequently their incomes) was a 
function of the quality of their reputations for skill in their profession (see Mossner and 
Ross 1977, pp. 173-79). 

6 Surprisingly, Smith's clearest and most concise statement of a human capital view of 
moral behavior was not in Wealth, but instead in Theory. In pt. III, chap. 5, he writes the 
following: 

If we consider the general rules by which external prosperity and adversity 
are commonly distributed in this life, we shall find [that] every virtue naturally 
meets with its proper reward, with the recompense which is most fit to encourage 
and promote it; and this too so surely, that it requires a very extraordinary 
concurrence of circumstances entirely to disappoint it. What is the reward most 
Religions tend to produce and distribute moral information about 
individual members. Moral information—that is, information with 
respect to an individual’s moral history—is valuable to the extent that 
it provides potential transactors with information permitting them to 
assess the risks associated with a given exchange. To the extent that 
moral duties are perceived in the market as relevant to assessing the 
riskiness of potential transactions, an individual’s moral reputation 
has a capital value; in an efficient human capital market the social cost 
of immoral behavior that is judged economically relevant will be fully 
reflected in reduced capital value of the individual’s reputation. 
Hence, given an efficient human capital market, economically rele¬ 
vant morality becomes self-enforcing because individuals bear indi¬ 
rect costs of their misbehavior. 

In a passage in Wealth Smith explains this monitoring function 
performed by religious orders and the consequent effect on the repu¬ 
tational human capital of members: 

[The man of low condition] never emerges so effectually 
from this obscurity, his conduct never excites so much the 
attention of any respectable society, as by his becoming the 
member of a small religious sect. He from that moment ac¬ 
quires a degree of consideration which he never had before. 
All his brother sectaries are, for the credit of the sect, inter¬ 
ested to observe his conduct, and if he gives occasion to any 
scandal, if he deviates very much from those austere morals 
which they almost always require of one another, to punish 
him by what is always a very severe punishment, even where 
no civil effects attend it, expulsion or excommunication from 
the sect. In little religious sects, accordingly, the morals of 
the common people have been almost always remarkably 
regular and orderly; generally much more so than in the 
established church. [1979, pp. 795-96] 

Smith emphasizes that these "small religious sects” are voluntary 
associations, whose chief form of sanction of their members is expul- 


proper for encouraging industry, prudence, and circumspection? Success in every sort of 
business. And is it possible that in the whole of life these virtues should fail of attaining 
it? Wealth and external honours are their proper recompense, and the recompense which 
they can seldom fail of acquiring. [1982, p. 166; emphasis added] 

In a similar vein, Smith writes in pt. I, sec. 5 (p. 65), that in “the middling and inferior 
stations of life, the road to virtue and that to fortune... are, happily in most cases, very 
nearly the same” (see also the discussion of this passage in Rosenberg [1984]). In pt. VI, 
sec. 5 (esp. pp. 224-25), he notes that individual good conduct and behavior tend to be 
rewarded with friendship and benevolence from others in society. 
The teachers of each sect, seeing themselves surrounded on 
all sides with more adversaries than friends, would be 
obliged to learn that candour and moderation which is so 
seldom to be found among the teachers of those great sects, 
who [as a result of legal entry restrictions facing competing 
sects] ... see nothing round them but followers, disciples, 
and humble admirers. The teachers of each little sect. . . 
would be obliged to respect those of almost every other sect, 
and the concessions which they would mutually find . . . 
convenient and agreeable . . . might in time . . . reduce the 
doctrine of the greater part of them to that pure and rational 
religion, free from every mixture of absurdity, imposture, or 
fanaticism .... This plan of ecclesiastical government, or 
more properly of no ecclesiastical government, [would tend to be] 
productive of the most philosophical good temper and mod¬ 
eration with regard to every sort of religious principle. [P. 
793; emphasis added] 

Incidentally, this passage is probably the closest Smith comes in 
Wealth (or elsewhere) to arguing in favor of free-market anarchism. 10 

Smith implicitly suggests that the quality of religion can be objec¬ 
tively ascertained and evaluated, as in the case of any other good or 
service. He did not view all religion as equally irrational; different 
types of religious doctrine have different effects on the behavior of 
individual believers and hence on the operation of the economic sys¬ 
tem. “Pure and rational religion, free from every mixture of absur¬ 
dity, imposture, or fanaticism . . . productive of the most philosoph¬ 
ical good temper and moderation” is optimal in the sense that it is 
most consistent with the efficient operation of the economic system, 
because it tends to produce changes in individuals that facilitate their 
participation in the contractual order of the market economy. “Philo¬ 
sophical good temper and moderation” are necessary prerequisites 
for the peaceful functioning of the division of labor; “the grossest 
delusions of superstition” (see bk. V, chap. 1, p. 803) are inconsistent 
with the rapid progress in pure and applied science. 

This constitutional analysis of religion anticipates the work of 
Weber (1958) over a hundred and thirty years later about the role of 
the rise of Protestantism (and, more specifically, Calvinism) in provid¬ 
ing a moral (or, in Smith’s terminology, “constitutional”) basis for the 
rapid expansion of capitalism in the West following the Reformation 


10 While this is undoubtedly one of the best presentations of the efficient markets 
argument in Wealth, it has been largely overlooked by Smithian scholars. An important 
exception is Levy (1978, pp. 671-72), who examines the passage in detail. 
(see Weber 1958, pp. 47-78). Smith assumed that the doctrines of the 
Roman church impeded the development of capitalism by promoting 
anticommercial attitudes and barriers to trade. The Roman church is 
described as “the most formidable combination that ever was formed 
against the authority and security of civil government, as well as 
against the liberty, reason, and happiness of mankind, which can 
flourish only where civil government is able to protect them” (1979, 
pp. 802-3). Interestingly, Smith’s anticipation of Weber’s famous ar¬ 
gument has gone unacknowledged in the debates over the latter’s 
thesis. 11 

In sum, Smith argued that moral codes were at least partially self- 
enforcing and that religion served as a voluntary portion of this self¬ 
enforcement mechanism. 12 The public perception of an individual’s 
moral character influenced that person’s expected future income 
stream; hence, an income-maximizing individual had an economic 
incentive to participate in organized religion. In other words, at least 
some forms of religious behavior were economically rational. 

III. The Economics of Preaching 

Although Smith frequently reiterated his assumption that political 
decision makers were rational self-interested economic actors, he 
nevertheless sometimes insisted on proffering to politicians and other 
national rulers advice on how to improve the operations of govern¬ 
ments and the value of governmental assets (see Stigler [1982] for 
several examples). In other words, Smith the economist sometimes 
acted like Smith the preacher. 13 

However, it is also true that Smith often explicitly recognized that 
ideas were sometimes promulgated at the behest of particular groups 


11 Some modern scholars take a different view of the relationship between the Ro¬ 
man Catholic church and the rise of capitalism. Berman (1983, pp. 336-39) argues that 
the Roman church was not hostile to either the accumulation of wealth by individuals or 
market exchange and that the canon law provided much of the basis for the mercantile 
law of the late Middle Ages. But one need not necessarily endorse the empirical validity 
of Smith's argument to note that it represented an important analysis of the role of 
institutional constraints—in this case, religious doctrines—on the growth and develop¬ 
ment of the economic system. 

12 See Levy (1985, pp. 115-20) for a related discussion. 

13 Given Smith's insistence on the vital role of persuasion in all human exchange 
(1978, p. 352) and his related belief that historical analysis (which would include the 
analysis of public policy) was necessarily a rhetorical enterprise (i.e., involving competition 
between alternative explanatory theories [1985, p. 89, lecture 17]), this tendency 
on his part is not surprising. Economic action necessarily involves persuasion, and 
economic explanation is in part an attempt to convince. To paraphrase in modern 
terminology, positive economic analysis becomes accepted to the extent that its 
practitioners are successful intellectual entrepreneurs. 
who expected to benefit economically as a result; that is, political ideas 
were occasionally a device by means of which political policies were 
marketed by their beneficiaries (see Anderson [1990] for examples). 
Notably, some of the most interesting examples of this economic 
analysis of “preaching” in Wealth concerned religious preaching, that 
is, the clergy as an interest group. 

Smith’s assessment of the relative importance of ideas as motivating 
factors in religious behavior is lucidly summarized in a passage that 
appears in the course of a discussion of the history of the Roman 
church during the time of the Crusades. In reference to the “constitu¬ 
tion” of the church, he writes that 

the grossest delusions of superstition were supported in such 
a manner by the private interests of so great a number of 
people as put them out of all danger from any assault of 
human reason: because though human reason might per¬ 
haps have been able to unveil, even to the eyes of the com¬ 
mon people, some of the delusions of superstition; it could 
never have dissolved the ties of private interest. Had this 
constitution been attacked by no other enemies but the fee¬ 
ble efforts of human reason, it must have endured forever. 
But that immense and well-built fabric, which all the wisdom 
and virtue of man could never have shaken, much less have 
overturned, was by the natural course of things, first 
weakened, and afterwards in part destroyed, and is now 
likely, in the course of a few centuries more, perhaps, to 
crumble into ruins altogether. [1979, p. 803] 

The rhetoric of religious belief is interpreted as an expression of 
private interests, and the influence of a particular body of religious 
belief is hypothesized to change as a function of changes in the com¬ 
position of relevant economic interests. He argues that the particular 
religious doctrine in question (promulgated by the organized interest 
group supporting the church) was unaffected by the arguments of 
intellectual critics and in fact could never have been significantly af¬ 
fected merely by the ideas of rationalist opponents like himself. 

“The clergy of every established church,” Smith writes in Wealth, 
“constitute a great incorporation.” He continues, in one of the most 
remarkable paragraphs in the book, that they “act in concert, and 
pursue their interest upon one plan and with one spirit, as much as if 
they were under the direction of one man; and they are frequently 
too under such direction” (p. 797). In Smith’s view, there is little in the 
way of economically relevant distinction between this particular corpo¬ 
ration (e.g., guild) and those organized around other occupations 
engaged in more mundane pursuits. The clergy is basically just an¬ 
other interest group. 14 

By "established" religion, Smith meant one protected by law from 
the competition of other sects (he also used the more descriptive 
phrase “an established or governing religion” [p. 797; emphasis 
added]). Although he obviously intended for his observations con¬ 
cerning clerical behavior to be generally applicable to other sects, the 
primary source of examples in article 3 is the Roman church. 

The self-interest model was at the heart of Smith’s analysis of the 
behavior of the suppliers of religion. He states this very clearly: 

In the church of Rome, the industry and zeal of the inferior 
clergy is kept more alive by the powerful motive of self- 
interest, than perhaps in any established protestant church. 
The parochial clergy derive, many of them, a very consider¬ 
able part of their subsistence from the voluntary oblations of 
the people; a source of revenue which confession gives them 
many opportunities of improving. The mendicant orders 
derive their whole subsistence from such oblations. It is with 
them, as with the hussars and light infantry of some armies; no 
plunder, no pay. [Pp. 789—90; emphasis added] 15 

The self-interest of the clergy has a tangible economic mani¬ 
festation in the form of the benefices of the clergy, “a sort of freeholds 
which they enjoy ... during life” (p. 798). The benefices were granted 
(in the Catholic church) by the bishops, "who bestowed them upon 
such ecclesiastics as [they] thought proper” (p. 800). The clergy is 
claimed to have actively engaged in rent seeking in order to obtain 
benefices from the bishops: “The ambition of every clergyman natu¬ 
rally led him to pay court... to his own order, from which only he 
could expect preferment" (p. 800). 16 


14 Modern scholars agree that the medieval clergy had many of the characteristics of 
an organized interest group. Berman (1983, p. 108) maintains that the clergy “became 
the first translocal, transtribal, transfeudal, transnational class in Europe to achieve 
political and legal unity.” It became so by “demonstrating that it was able to stand up 
against, and defeat, the one preexisting universal authority, the emperor” (Henry IV, 
the Holy Roman Emperor, who unsuccessfully resisted the expanding power of the 
papacy in a conflict with Pope Gregory VII at the end of the eleventh century). The 
clergy was a powerful and well-organized political force. 

15 In the late Middle Ages (and, indeed, even in Smith's day) mercenary soldiers were
common, and plunder was typically an important part of a soldier's income, even in
national armies. On the mercenary nature of early armies and plunder as an important
part of their remuneration, see McNeill (1982, pp. 105-7) and Corvisier (1979, pp. 41-46).

16 Smith also provided an interesting argument about the effect of the relative magnitude
of benefices on the efficiency in the allocation of resources in society. In some
countries church benefices provided larger incomes than university chairs. Conse-
The link between Hume and Smith may be important for understanding
the latter's intent in his discussion of the Roman church. As
noted previously, Hume favored an established religion because he 
believed that such an institution tended to mute the fanaticism some¬ 
times associated with independent religious sects. In a related argu¬ 
ment, Hume expressed his (relative) approval of the Catholic religion 
because he interpreted it to be a less “metaphysical” brand of religion
than the Protestant-Puritan sects, which he labeled “enthusiasms” and 
which he claimed inspired political fanaticism (see Livingston 1984, 
pp. 315-16). Smith's opposing view, that free markets and not state 
monopoly tended to provide optimal religious institutions, naturally 
focused on the history of the Roman church (whose history of over a 
thousand years was well known) for examples supporting his thesis. 
However, given the close connections between the two men, it is also
possible that Smith’s concentration on the Catholic church was in part
a rebuttal to Hume’s favorable pronouncements on the same institu¬ 
tion, based on his diametrically opposing position. 


IV. The Church as a Corporation 

Smith’s account of the complex fiscal organization of the Roman 
church is reminiscent of a large modern corporation. In fact, the 
financial organization of the medieval church is reminiscent of a large 
system of corporate franchise. Clerics paid rent to the Vatican for the 
continuing privilege of operating benefices; although this rent was 
described as a “tax” (and took a wide range of specific forms that 
varied across Europe and over time), clerics holding “franchises" 
(which included parishes, bishoprics, ecclesiastical foundations,
monasteries, and other organizations) who failed to pay up were re¬ 
placed (Lunt 1965, pp. 57-77). These franchises obtained revenues 
through either the normal operations of the productive resources at 
their disposal (for instance, the sale of agricultural output and the 
receipt of rent in the case of estates), voluntary contributions from the 
faithful, or the sale of indulgences and other services. 


quently, as in England, “the church is continually draining the universities of all their
best and ablest members” (p. 811). But as members of the church, scholars are unavailable
as teachers, which causes the quality of the individual’s “learning and knowledge”
to decline relatively, given that teaching is “the most effectual method for rendering
him compleatly master of” his subject (p. 812). If the returns from benefices are
relatively high, a social loss from misallocation of intellectual resources tends to result.
The corollary was that relatively low-income benefices had the opposite effect: “The
mediocrity of church benefices naturally tends to draw the greater part of men of
letters, in the country where it takes place, to the employment in which they can be the
most useful to the publick, and, at the same time, to give them the best education,
perhaps, they are capable of receiving. It tends to render their learning both as solid as
possible, and as useful as possible” (p. 812).
The central financial administration of the papacy was called the
camera (chamber). The bulk of papal income flowed into this office. 
During the pontificate of John XXII, these receipts appear to have 
averaged about 228,000 florins a year. In the middle years of the 
fourteenth century the annual receipts of the camera varied from 
130,000 to 335,000 florins. In the first half of the fourteenth century 
the exchange rate of florins was around six to the pound sterling; 
therefore, the average annual papal income between 1316 and 1362 
varied between about £24,000 and £42,000. For purposes of compari¬ 
son, the average annual income of the English king during the reigns 
of Edward II and Edward III (1307-77) ranged from £35,000 to 
£272,000 and averaged £91,000 during the reign of Edward II and 
£140,000 during the reign of Edward III (Lunt 1965, pp. 13-14). 
Cameral receipts continued to grow steadily until the end of the
fifteenth century. 

The organization of the church maintained a complex and elabo¬ 
rate system for distributing this economic rent, directed by the pope, 
whose financial role could be likened to a modern-day corporate 
chairman of the board. In Smith’s day, the only nongovernmental 
organizations that could begin to compare with the Roman church in 
size and complexity were the large foreign trading companies, such as 
the English East India Company. 17 

The Roman Catholic church in the period prior to the Reformation 
was the monopoly supplier of religion in Europe. Competing entrants 
in the supply of religion were defined as “heretics” and systematically 
persecuted. Competitors were simply burned at the stake or otherwise 
forcibly prevented from marketing their services to consumers. 
Smith's principal objection to the Roman church was its coercive mo¬ 
nopoly in the market for religion. 18 The Roman church was a kind of 


17 Although space precludes more than a brief summary of these various sources of 
incomes here, the complex financial affairs of the medieval church are extremely 
interesting in their own right. The papacy taxed various ecclesiastical bodies such as 
monasteries and sold the protection of St. Peter for the temporal possessions of kings, 
princes, lords, and cities (Lunt 1965, pp. 61-62, 63). Income taxes were sometimes
imposed on the clerical subjects of the popes, usually in connection with (and ostensibly
for the purpose of supporting) a Crusade (pp. 71-73). In the late eleventh century, the
sale of indulgences (partial or total pardon of the penance required for forgiveness of
sin) began to become an important source of revenue, by 1500 accounting for perhaps
as much as one-half of total revenue or more (pp. 111-15, 123-24). The sale of offices
at the papal court, which began during the pontificate of Boniface IX (1389-1404), also 
eventually became a major source of revenue. In addition to these, there were many 
other forms of revenue that were of at least minor significance during this period. 

18 Elsewhere in Wealth, Smith criticized Hobbes’s doctrine that wealth was equivalent
to power. In bk. I, chap. 5, Smith argued that wealth might be used to purchase political
power but did not necessarily lead to such power (p. 48). This is important because it
implies that Smith did not confuse the church’s wealth with its legal monopoly privileges,
which that wealth allowed it to buy and maintain. Thus he was offering an
economic argument and not a diatribe against clerical riches.
spiritual equivalent of the East India Company monopoly, which
Smith had extensively analyzed in the immediately preceding section
in book V, chapter 7, in Wealth. Although he did not himself explicitly
draw this analogy, the analogy itself is striking. Like the East India 
Company, the church was a far-flung, international enterprise with a 
highly complex and centrally directed organizational structure. Al¬ 
though Smith was more interested in the consequences of the com¬ 
pany’s monopoly franchise, it was one of the most complex and 
sophisticated business organizations of the pre-twentieth-century pe¬ 
riod. 19 Both the company and the church had the legal right to pro¬ 
hibit the entry of potential competitors into their respective markets. 
Both organizations took on numerous governmental attributes and 
responsibilities while remaining outside constitutional or electoral 
constraints on their behavior. And like the company, the church’s 
monopoly produced a significant loss to society. 

Smith did not present a notion of monopoly welfare loss in pre¬ 
cisely modern terms. But he did argue that monopoly leads to restric¬ 
tions on output, or declines in quality, combined with increases in 
price, and therefore harms consumers (see Anderson and Tollison 
1982). It is in this context that Smith’s passage about the “constitu¬ 
tion” of the Catholic church supporting the “grossest delusions of 
superstition” (1979, p. 803) should be interpreted. Smith was, in ef¬ 
fect, accusing the monopoly church of reducing the quality of religion 
supplied to consumers, whose welfare was reduced as a result. In the 
same passage he clearly attributes this quality reduction to the self- 
interested behavior of the clergy, who extracted monopoly rent from 
their flock both directly and indirectly by promulgating irrational
doctrines that served their own interests. The consumers of religion 
were badly served by the monopoly purveyor of spiritual guidance, 
just as in the case of monopolies in the provision of more mundane 
goods. 

During the feudal period, the church represented a transnational 
corporation that was considerably more powerful than any single na¬ 
tional government (see Anderson [1987] for a more detailed ac¬ 
count). Not only did some prelates have as many retainers as “the 
greatest lay-lords,” but the total number of retainers of all the clergy 
together was probably greater than the total number of retainers 
maintained by feudal barons (Smith 1979, p. 801). Perhaps more 
important, the church functioned as a unified and cohesive organization:
“The [clergy] were under a regular discipline and subordination
to the papal authority. The [lay-lords] were under no regular disci- 

19 Anderson, McCormick, and Tollison (1983) analyze the East India Company from 
the perspective of the modern theory of the firm and argue that the company was
probably the first true multidivisional corporate enterprise. 
pline or subordination, but almost always equally jealous of one another,
and of the king. Though the tenants and retainers of the
clergy, therefore, had both together been less numerous than those of
the great lay-lords, and their tenants were probably much less numerous,
yet their union would have rendered them more formidable”
(p. 801).

In feudal times (“the antient state of Europe”), “the wealth of the
clergy gave them the same sort of influence over the common people, 
which that of the great barons gave them over their respective vassals, 
tenants, and retainers" (p. 800). The church acquired great landed 
estates through “the mistaken piety both of princes and private per¬ 
sons.” The church maintained feudal manors, presided over by clergy 
(usually bishops), essentially acting as barons. The clergy collected 
rents from their tenants and in addition collected tithes from the 
rents of all the other estates in every kingdom of Europe. 

Smith noted on several different occasions that the church and 
national states were economic competitors. 20 Most important, they 
competed for the same tax base. He argues that, regardless of the 
exact nomenclature employed by the church to describe the income it 
collected from contributors, these sources were essentially taxes. Rev¬ 
enue extracted from the community by the church reduced the po¬ 
tential revenue available for government. But also, by operating a 
parallel system of justice—the ecclesiastical courts—the medieval 
church competed with the state in the provision of public order. 
Smith was particularly interested in one feature of this system of 
competing jurisdictions: the privileges of clergy. Members of the 
clergy were exempt from secular jurisdiction and could not be tried 
except before an ecclesiastical court. 21 This made the clergy at least
partially independent of governmental authority. 22 


20 Smith argues that the “hospitality and charity” of the clergy both gave them “great
temporal force” and increased the weight of their “spiritual weapons.” As a result, the
clergy enjoyed powerful popular support and were therefore protected from the authority
of governments (p. 802).

21 This account of the “privilege of clergy,” according to which the clergy are described
as having something similar to diplomatic immunity, was exaggerated. In practice
(in England at least), the privilege was not always successfully invoked by individuals
who otherwise were probably eligible, and secular courts sometimes refused
(usually indirectly) to recognize, and otherwise evaded, the nominal protection (see
Bellamy 1984, pp. 116-17).

22 Consider Smith’s account of the church in feudal times: “The privileges of the
clergy [included] their total exemption from the secular jurisdiction ... what in England
was called the benefit of clergy .... How dangerous must it have been for the
sovereign to attempt to punish a clergyman for any crime whatever, if his own order
were disposed to protect him, and to represent either the proof as insufficient for
convicting so holy a man, or the punishment as too severe to be inflicted upon one
whose person had been rendered sacred by religion” (1979, p. 802). The clergy were
simply independent of the sovereign, as agents of what might be described as a ‘parallel
It would be only a slight exaggeration to suggest that the image of
the church that emerges from his discussion resembles nothing in the
modern world so much as the Mafia. Like the Mafia, the church is a
supra- or transnational organization whose members, although they
live within the boundaries of one or another nation-state, owe their
primary allegiance to that organization. Alternatively (and less provocatively),
the church can be described as in many respects analogous
to a modern multinational corporation. Whichever analogy is
deemed most appropriate, the Roman church had historically been a 
rival power to the national government in England and in other West¬ 
ern European countries. 

However, Smith argued that European governments had brought 
this problem on themselves by way of their policies toward religious 
markets and that the lifting of existing restrictions on religious com¬ 
petition, and not further intervention, would resolve the dilemma. 
Organized religion would not constitute a threat to the authority of 
government if the latter would refrain from granting any one religion 
a legal monopoly: 

In a country where the law favoured the teachers of no 
one religion more than those of any other, it would not be 
necessary that any of them should have any particular or 
immediate dependency upon the sovereign or executive 
power; or that he should have any thing to do, either in 
appointing, or in dismissing them from their offices. In such 
a situation he would have no occasion to give himself any 
concern about them, further than to keep the peace among 
them, in the same manner as among the rest of his subjects; 
that is, to hinder them from persecuting, abusing, or oppress¬ 
ing one another. But it is quite otherwise in countries where 
there is an established or governing religion. The sovereign 
can in this case never be secure, unless he has the means of 
influencing in a considerable degree the greater part of the 
teachers of that religion. [P. 797] 

State intervention in religious markets tended to undermine the authority
of government, and the sovereign would increase his own
security of tenure by eliminating restrictions on religious competition. 23

Smith argued that this competition between the jurisdictions of the 


23 Note that Smith is not claiming that the establishment of religion is necessarily
inconsistent with the interests of particular individual sovereigns, but simply that in the
long run, future (individual) sovereigns would be better off if religious markets were
kept open and unrestricted.
church and the state reduced the ability of government to efficiently
provide public order (“all other things being supposed equal, the
richer the church, ... the less able must the state be to defend itself”
[p. 812]). But why is monopoly to be preferred to competition in this
case? If the church was in some respects a “competing government,”
why was competition in the provision of governmental services less
efficient than competition in the provision of any other goods and
services? Smith noted the “welfare loss” resulting from monopoly in
the supply of religious services and even argued that free competition
in the market for religion produced optimal outcomes. Those who,
like Smith, favor free competition except with respect to the provision
of public goods must be prepared to justify this exception. Smith
apparently did not perceive this need.

This question aside, Smith proceeded to consider the long-run consequences
of the competition between church and state. He suggested
that in the long run both parties unintentionally promoted economic
development as a by-product of their own self-seeking behavior. He
argues that the church played an important (if unintentional) role in
the transition from feudalism to the market society. This analysis is
examined in Section V.


V. The Protestant Reformation from an
Economic Perspective

I noted above that Smith argued that the constitution of the church of
Rome, “that immense and well-built fabric,” was in the end overturned
not because of the “feeble efforts of human reason,” but instead
because of economic factors. In fact, Smith included in chapter
1 an extensive discussion of the causes of the religious and social
changes that culminated in the Reformation. He developed a careful
economic analysis of these institutional changes in which theological
disputes play no important role.

A key element in this discussion is Smith's hypothesis that, in the
primitive economic order of feudal Europe, there were only very
limited consumption options available to the holders of wealth. Because
of the extremely limited range of consumer goods available, the
wealthy had few options but to spend their resources on providing
themselves with large numbers of retainers or followers: consumption
was “manpower intensive” (this theme in Smith has been explored by
Rosenberg [1960]). 24 The clergy spent much of their wealth on “charity,”
which Smith portrayed as thinly disguised influence peddling.

24 A similar argument (Smith 1979, p. 421) applied to the declining power of the
barons has recently been reviewed by Winch (1978, pp. 77-78).
But the gradual emergence of a market economy in Europe disturbed
the equilibrium of the feudal age: 

The gradual improvements of arts, manufactures, and 
commerce, the same causes which destroyed the power of 
the great barons, destroyed .. . through the greater part of 
Europe, the whole temporal power of the clergy. In the pro¬ 
duce of arts, manufactures, and commerce, the clergy, like the 
great barons, found something for which they could exchange their 
rude produce, and thereby discovered the means of spending their 
whole revenues upon their own persons, without giving any consid¬ 
erable share of them to other people. Their charity became gradually 
less extensive, their hospitality less liberal .... The ties of interest, 
which bound the inferior ranks of people to the clergy, were 
in this manner gradually broken and dissolved. . . . On the 
contrary, [the poor] were provoked and disgusted by the 
vanity, luxury, and expence of the richer clergy. [1979, pp. 
803—4; emphasis added] 

This provocative passage includes the basic elements of a self- 
interest theory of charity (emphasized above), one of Smith’s more 
radical extensions of the self-interest model that has been widely ne¬ 
glected. Obviously, Chicago economics was not invented in Chicago. 

Smith also identified an interesting intertemporal public-goods 
problem: the rational, self-interested behavior of the (present) clergy 
harmed the credibility of the church as a whole and therefore imposed
negative externalities on, and reduced the incomes of, future
clergy. 

As the temporal power of the clergy was reduced, “the sovereigns 
of the different states of Europe” took advantage of the developing 
weakness by passing a series of laws (which the clergy was unable to 
oppose effectively) that granted the monarchs significant influence 
over the selection of bishops. This gradually provided the monarchs 
with the ability to control the flow of church benefices since the 
bishops were effectively the local managers of church assets. “As the 
clergy had now less influence over the people, so the state had more 
influence over the clergy. The clergy therefore had both less power 
and less inclination to disturb the state” (p. 805). 

In short, the power of the clergy declined in the later Middle Ages 
as a direct result of the forces of economic development, which led to 
a shift in the balance of power between church and state in favor of 
the latter. Monarchical rent seeking weakened the organization of the 
church by preying on its revenue base and rendering it vulnerable to 
competitive entry from other religious sects. 

While the “authority of the church of Rome was in this state of 
declension,” the Protestant Reformation occurred. Smith does not
concern himself with the religious beliefs and disagreements that
were the ostensible motivating forces behind this movement for reform.
Instead, he portrays this development from an economic perspective:
as the competitive entry of a new religious faction.
The Protestant Reformation gained a foothold, and grew in size
and influence, because European monarchs skillfully exploited this
religious schism for their own gain in their conflict with Rome: “The
success of the new doctrines was almost every where so great, that the
princes who at that time happened to be on bad terms with the court
of Rome, were by means of them easily enabled, in their own dominions,
to overturn the church, which, having lost the respect and veneration
of the inferior ranks of people, could make scarce any resistance”
(p. 806). 25

Again, this original analysis of the economic factors that influenced
[illegible] is generally in accordance with historical fact. Even before the
Reformation, the growing power of national governments had led to
significant concessions to temporal rulers on the part of the church
(see Chadwick 1972, pp. 25-26). Smith’s economic explanation of the
gains and losses to political decision makers as factors in the motivation
of the events surrounding the Reformation was based on a solid
foundation.


VI. Conclusion

While Stigler (1982) is correct in noting that Smith tended to deliver
sermons on occasion, it is also the case that some of the most interesting
and original analysis in Wealth was that applied to the profession
of preaching.

There were five major elements of note in Smith’s economic approach
to religion: (1) he offered a theory explaining the participation
of individuals in religion based on his theory of human capital,
(2) he modeled the suppliers of religion as self-interested income
maximizers, (3) he extended his theory of competitive markets to
religion, (4) he analyzed the church as a kind of firm and
devoted much attention to the economic effects of its monopoly in the
Middle Ages, and (5) he attempted to show how the self-interest of
the clergy and political leaders interacted with economic growth and
development.

The importance attached by Smith to the role of the growth of

25 The leaders of the Protestant Reformation “rewarded” monarchs of countries who
supported their movement by conceding to them the right to determine the “disposal
of all the bishopricks, and other consistorial benefices” within their dominions,
making the monarch “the real head of the church” (p. 807).
commerce and manufactures in promoting “order and good government,
and with them, the liberty and security of individuals” (1979, p.
412) has been rightly noted in the previous literature (see, e.g.,
McFarlane 1982, p. 14; Rosenberg 1984, p. 21). The expansion of the
commercial society generated changes of profound significance in the 
moral order of society. The intent here is not to disparage the relative 
importance of these factors in Smith’s work, but rather to suggest that 
he recognized religion as also playing a vital role in creating the 
“moral constitution” of the market economy. 

To Smith, religion helped to both generate and maintain the moral 
order that was a prerequisite for the commercial society. State- 
sponsored monopoly religion performed this function less efficiently 
than free competition could. 26 In this way, Smith linked religious free¬ 
dom to economic freedom. An economy based on capitalist individ¬ 
ualism was best served by religious movements that emerged from 
free markets themselves. 

A widely accepted “stylized fact” (even among economists) is that 
Adam Smith was the “first economist." Naturally, all students of the 
history of economic thought recognize this as a gross distortion. However,
many also fail to appreciate that Smith may well have been the
first economic imperialist. It is hoped that future Smith scholarship will 
increasingly recognize this aspect of his work. 
Comments


The Probability of Gross Violations of a 
Present Value Variance Inequality 

Robert J. Shiller 

Yale University 


One way of evaluating the simple efficient markets model of stock
prices is to compare the volatility of the “perfect-foresight” price per
share $P_t^*$ with that of the actual price per share $P_t$. The perfect-foresight
price $P_t^*$ is the present value at time $t$ of actual future
dividends $D_{t+k}$, $k > 0$, discounted at constant rate $r$. The simple
efficient markets model asserts that $P_t = E_t P_t^*$, which implies (if detrended
series $p_t$ and $p_t^*$ are stationary) that $\sigma^2(p^*) \ge \sigma^2(p)$. I found
(Shiller 1981) evidence suggesting gross violation of this and other
such variance inequalities. With annual U.S. data from 1871-1979,
$s(p)/s(p^*) = 5.59$. (Here, $\sigma$ and $s$ denote population and sample standard
deviations, respectively.)
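
The inequality rests on a standard property of optimal forecasts. A brief sketch in the notation above follows; the forecast-error symbol $u_t$ is introduced here only for illustration and does not appear in the original.

```latex
\begin{align*}
p_t^* &= p_t + u_t, \qquad u_t \equiv p_t^* - E_t p_t^*, \\
\operatorname{cov}(p_t, u_t) &= 0
  \quad \text{(a rational forecast error is uncorrelated with the forecast)}, \\
\sigma^2(p^*) &= \sigma^2(p) + \sigma^2(u) \;\ge\; \sigma^2(p).
\end{align*}
```

The first line uses $p_t = E_t p_t^*$; the last line is the variance of a sum of two uncorrelated terms.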

Allan Kleidon (1986) criticizes such methods of evaluating efficiency.
By Monte Carlo methods he computes the probability that
$s(p)/s(p^*) > 5$ with 100 observations under the simple efficient markets
hypothesis and assuming that $\log(D_t) = \mu + \log(D_{t-1}) + \epsilon_t$, where $\epsilon_t$ is
independent normal $N(0, \sigma^2)$ and $\mu$ is a constant. He finds that the
probability may be substantial, as high as .397.

In each of his 1,000 Monte Carlo iterations, Kleidon first generated
100 observations of a random normal variable with $\mu = .0095$ and $\sigma = .218$.
He cumulated and exponentiated these to produce a real
dividend series $D_t$. He produced a price series from the dividend
series by multiplying by $(1 + g)/(r - g)$, where $r$ is the discount factor
and $g$ is the expected growth of dividends, equal to $\exp[\mu + (\sigma^2/2)] - 1$.
Duplicating the procedure in my original paper, he estimated a trend
by regressing $\ln(P_t)$ on time $t$. With the slope coefficient $b$ in this
regression, he produced detrended series $d_t = D_t \exp(-bt)$ and $p_t = P_t \exp(-bt)$.
He then produced detrended $p_t^*$ recursively with
$p_t^* = (p_{t+1}^* + d_{t+1})/(1 + r)$, working backward from $p_T^* = p_T$, where $T$ is the
terminal date. In his table 2 he provides statistics on $s(p)/s(p^*)$ in the
1,000 iterations. I succeeded in replicating his results (table 1, case A).
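
The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the seed, the time index starting at 0, and the use of the sample standard deviation are choices made here, and the sketch takes $1 + g = \exp(\mu + \sigma^2/2)$, which matches the 3.382 percent growth figure quoted later in the comment. It is not expected to reproduce the published table entries exactly.

```python
import math
import random
import statistics

def ols_slope(y):
    """Slope from regressing y on time t = 0, 1, ..., len(y) - 1."""
    n = len(y)
    t_bar = (n - 1) / 2
    y_bar = sum(y) / n
    num = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
    den = sum((t - t_bar) ** 2 for t in range(n))
    return num / den

def simulate_ratio(r, mu=0.0095, sigma=0.218, n_obs=100, rng=random):
    """One draw of s(p)/s(p*) using the unadjusted recursion (case A)."""
    g = math.exp(mu + sigma ** 2 / 2) - 1       # expected dividend growth (assumed form)
    log_d, D = 0.0, []
    for _ in range(n_obs):
        log_d += rng.gauss(mu, sigma)            # cumulate the normal shocks
        D.append(math.exp(log_d))                # exponentiate to get dividends
    P = [x * (1 + g) / (r - g) for x in D]       # price is proportional to dividend
    b = ols_slope([math.log(x) for x in P])      # trend slope of ln(P_t) on t
    d = [x * math.exp(-b * t) for t, x in enumerate(D)]
    p = [x * math.exp(-b * t) for t, x in enumerate(P)]
    p_star = [0.0] * n_obs
    p_star[-1] = p[-1]                           # terminal condition p*_T = p_T
    for t in range(n_obs - 2, -1, -1):           # work backward from T
        p_star[t] = (p_star[t + 1] + d[t + 1]) / (1 + r)
    return statistics.stdev(p) / statistics.stdev(p_star)

def prob_gross_violation(r, n_iter=1000, threshold=5.0, seed=0):
    """Share of iterations in which s(p)/s(p*) exceeds the threshold."""
    rng = random.Random(seed)                    # hypothetical seed for repeatability
    hits = sum(simulate_ratio(r, rng=rng) > threshold for _ in range(n_iter))
    return hits / n_iter
```

Calling `prob_gross_violation(0.05)` gives a simulated frequency for the 5 percent discount rate; different seeds will give somewhat different values.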

There are two respects in which Kleidon’s calculations of this prob¬ 
ability are possibly misleading or incorrect. First, he does not adjust 
his discount factor r for detrending. Second, and more important, he 
uses assumptions that produce unrealistic dividend-price ratios. 

For the first point note that the recursion that I used in my original
paper to generate $p_t^*$ was $p_t^* = (p_{t+1}^* + d_{t+1})/[(1 + r)\exp(-b)]$. With
my recursion, the model $P_t = E_t P_t^*$ implies $p_t = E_t p_t^*$. With his, it does
not. He is in effect using a higher discount rate for $p^*$ than for $p$,
which should have the effect of creating spurious volatility in his
simulations for $p$. Case B of table 1 redoes Kleidon’s simulations with
this change in the recursion. The probability of “gross violations,”
column 2, is reduced somewhat.
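
A small deterministic check may make the first point concrete. The sketch below is an illustration, not part of the original comment: it assumes a hypothetical path on which dividends grow at exactly the rate $g$ with no shocks, so that the detrended price is constant and the requirement $p_t = E_t p_t^*$ reduces to $p_t^* = p_t$. The function name and parameter values are assumptions made here.

```python
import math

def backward_p_star(d, p_T, r, b=0.0):
    """Backward recursion p*_t = (p*_{t+1} + d_{t+1}) / [(1 + r) exp(-b)].

    With b = 0 this is the unadjusted recursion; with b equal to the
    detrending slope it is the adjusted one.
    """
    n = len(d)
    p_star = [0.0] * n
    p_star[-1] = p_T                       # terminal condition p*_T = p_T
    for t in range(n - 2, -1, -1):
        p_star[t] = (p_star[t + 1] + d[t + 1]) / ((1 + r) * math.exp(-b))
    return p_star

# Hypothetical shock-free path: dividends grow at exactly g each period.
r, g, n = 0.05, 0.03382, 100
b = math.log(1 + g)                        # trend slope of ln(P_t) on this path
d = [1.0] * n                              # detrended dividends are constant
p = (1 + g) / (r - g)                      # detrended price is constant too

adjusted = backward_p_star(d, p, r, b)     # stays equal to p at every date
unadjusted = backward_p_star(d, p, r)      # falls below p away from the terminal date
```

On this path the adjusted recursion returns $p_t^* = p_t$ throughout, while the unadjusted one discounts too heavily and pulls $p_t^*$ below $p_t$ as the recursion moves back from $T$.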

For the second point, note that we can estimate the discount factor $r$
in the model merely by taking the average real return in the market
$R_t$. With the annual U.S. data in my original paper for 1871-1979, the
mean of $R$ is 7.848 percent with a standard error of 1.711 percent.
Kleidon’s choices for his tables of $r$ = 5 percent, 6.5 percent, and 7.5
percent are thus a little on the low side but are not unreasonable:
none is more than two standard errors below the mean. However, the
implied constant dividend-price ratios $(r - g)/(1 + g)$ are often unreasonable:
with his estimate of $g$ of 3.382 percent, these are 1.565
percent, 3.016 percent, and 3.983 percent, respectively. The mean
actual dividend-price ratio with this sample is 5.138 percent with a
standard error of 0.389 percent based on an AR(1) model for the
dividend-price ratio. His dividend-price ratios are extreme because as
he varies $r$ in his simulations he holds $g$ constant. Yet most of the
uncertainty about the population mean of $R_t = (P_{t+1} - P_t)/P_t + D_t/P_t$
is due to uncertainty about the rate of growth, the mean of
$(P_{t+1} - P_t)/P_t$, rather than to uncertainty about the mean of the dividend-price
ratio, $D_t/P_t$. When he varies $r$ he should vary $\mu$, and hence $g$
as well, holding $(r - g)/(1 + g)$ as the average dividend-price ratio.
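
The implied ratios quoted above can be checked with a few lines of arithmetic. The sketch assumes $1 + g = \exp(\mu + \sigma^2/2)$ with $\mu = .0095$ and $\sigma = .218$, which is the form consistent with the 3.382 percent growth figure; the function name is a label chosen here.

```python
import math

mu, sigma = 0.0095, 0.218
g = math.exp(mu + sigma ** 2 / 2) - 1           # expected dividend growth

def implied_dividend_price_ratio(r, g):
    """Constant dividend-price ratio (r - g) / (1 + g) implied by the model."""
    return (r - g) / (1 + g)

ratios = {r: implied_dividend_price_ratio(r, g) for r in (0.05, 0.065, 0.075)}

print(f"g = {100 * g:.3f} percent")
for r, dp in ratios.items():
    print(f"r = {100 * r:.1f} percent: D/P = {100 * dp:.3f} percent")
```

Each of the three implied ratios lies well below the sample mean of 5.138 percent.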

Assuming too low a dividend-price ratio has the effect in his simulations
of overstating the variability of price, by a factor of over three in
the case of $r$ = 5 percent. The fault in his procedure can be seen by
reductio ad absurdum: for $r$ = 3.382 percent the standard deviation of
price would be infinite. Of course, in principle price could be much 
more volatile than dividends if r approaches g and the dividend-price 
ratio approaches zero, but that is hardly the route toward justifying 



[Table 1 appears here in the original: Monte Carlo results on s(p)/s(p*) for cases A, B, and C, with the probability of a gross violation in column 2. The table body is not recoverable from this copy.]




logs JOURNAL OF POLITICAL ECONOMY 

the actual volatility of prices relative to dividends when the actual 
dividend-price ratio is around 5 percent. 

Part C of table 1 reports Monte Carlo results that are identical to
those in part B except that μ was varied with r so that $(r - g)/(1 + g)$
equals .05138 (the sample average dividend-price ratio for 1871-1978).
The results in part C, column 2, show that for any of these discount
rates, a gross violation has a probability of less than 1 percent.

The random walk case for log dividends assumed here is extreme:
some other unit root models for dividends give much lower probabilities.

The Probability of Gross Violations of a
Present Value Variance Inequality: Reply

Allan W. Kleidon 

Stanford University 


In a recent critique of variance bounds tests (Kleidon 1986), I raised
several issues concerning methodology, statistical procedures, and
interpretation of results in papers such as Shiller (1981), and I
conducted alternative tests of the hypothesis that prices can be regarded
as the present value of expected future cash flows (see the abstract, p.
953, and conclusions, pp. 995-98, for more details). I concluded that,
because of various problems with arguments and techniques, there
was little in Shiller to support his claim that “the tendency of big
movements in [stock prices] to occur again and again” is sufficient to
imply that the 1929 crash could not be a “rational mistake, a forecast
error that rational people might make” (1981, p. 422).

Much of my discussion dealt with the properties of rational stock
prices and “perfect-foresight” prices $p_t^*$ for a cross section of
different economies at a point in time versus time series for a single
economy.¹ In his comment, Shiller (this issue) responds to one argument
(my sec. IIIA2), namely, that if prices are nonstationary, then
unconditional variances are not defined and the properties of sample
“variances” are unclear. I showed by simulation that if the procedures in
Shiller are applied to geometric random walks with parameter values
chosen to match those in his paper and Standard and Poor’s composite
price index for 1926-79, then sample variances showing the “gross
violations” he reported are quite possible.² Shiller’s response is to
recompute these simulations for different parameter values (and a
slightly different procedure for calculating a detrended $p_t^*$ series,
which has a trivial effect), with the result that the frequency of
apparent gross violations of the variance bound decreases relative to my
table 2 (p. 983).

1 The rational price $p_t$ is defined as the present value of expected
future dividends, while $p_t^*$ is defined as the present value of the ex
post dividends, discounted at the same rate (or rates) as for the
rational price.

2 I also derived conditional variance bounds that are valid for the
nonstationary processes under consideration and showed that they are not
violated for Standard and Poor’s data.

Although, as I discuss below, the procedures and parameter values
used to construct table 1 in Shiller’s comment are open to question,
the differences between my table 2 and his table 1 have little impact
on the fundamental conclusions. It is true that different choices for
parameter values can change the frequency of apparent gross violations,
but this serves to highlight the sensitivity of this statistical
procedure to reasonable alternative assumptions. Further, even for the
parameter values in the comment, the fundamental conclusion remains:
given that the simulations ignore the complications of nonconstant
discount rates, dividend smoothing, and other problems such as
the “peso problem” of rare events that did not occur in the sample,
the results provide little evidence of nonrational forces driving stock
price changes.

Now to details of the relevant simulations. Note that the geometric 
random walk model in these simulations assumes an identical process 
for dividends and prices, although as I discussed in some detail (pp. 
975-79, 993-95), this is neither necessary nor strictly correct for
Standard and Poor’s price and dividend series. The approach I 
adopted was, first, to assume a rational geometric random walk for 
prices, second, to infer (nonunique) consistent dividend and earnings 
processes (which ignore dividend and potentially accounting earnings 
smoothing and nonconstant discount rates), and, third, to investigate 
whether these processes can produce the kind of gross violations of 
variance bounds reported in, say, Shiller (1981). Both my table 2 and 
Shiller’s table 1 show that such apparent gross violations are possible 
in simulated rational series, although Shiller finds a lower proportion. 

However, the procedures and parameter values used for Shiller’s 
table 1 are questionable. One issue concerns dividend yields versus 
capital gain rates. My simulations used real rates (r) of 0.05, 0.065, 
and 0.075. As I noted (p. 982), the first two were chosen to match the 
rates given in Shiller (1981) of 0.048 for detrended data and 0.063 for 
raw data, and 0.075 was given for comparison. In his comment, Shiller
suggests that a more appropriate rate for the raw data in his
original paper is 0.078. The capital gain rate I used was based on
parameter values from Standard and Poor’s price series for 1926-79.

Since returns comprise dividend yields and capital gains, a value for
one component implies the complement for a given return. As Shiller
correctly notes in his comment, I used a fixed capital gain rate of
0.034 in my simulations, which had the effect of implying a changing
dividend yield component as the return r varied from 0.05 to 0.075.
However, his approach in table 1 is not an obvious improvement: he
holds the dividend yield fixed (at 0.051) across all r and varies the
capital gain component to give the required return. Thus, for example,
his simulations require a negative expected capital gain for r = 0.05.³

Further, it is by no means clear that 0.051 is the best rate to use in
the simulations. First, the model assumes that

$$p_t = \frac{1 + g}{r - g}\, d_t, \qquad (1)$$

where g is the expected growth rate in prices and dividends, and
Shiller sets the parameter $a = (1 + g)/(r - g)$ equal to the inverse of
“the mean actual dividend-price ratio” of 0.05138, or 19.46. If
equation (1) held exactly in actual data, there would be no differences
from estimating a as Shiller proposes or by the alternative procedures
of (a) the sample mean price-dividend ratio, (b) the ratio of the sample
mean price to the sample mean dividend, or (c) the ratio of price to
dividend for any observation. Of course, (1) does not hold exactly in
actual data, and so these alternative estimators give different
estimates for a because of (aside from procedure c) Jensen’s inequality.
The estimator Shiller uses gives a relatively low estimate of a, which it
turns out is favorable to his case.
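
The two ways of closing equation (1) discussed above can be compared with a few lines of arithmetic. The following is a minimal sketch, not the simulation code used in either paper; the function names are illustrative assumptions, and the inputs are the rates quoted in the text (a capital gain rate of 0.034 and a dividend yield of 0.051).

```python
# Illustrative sketch of equation (1): p_t = [(1 + g)/(r - g)] d_t,
# so the dividend-price ratio is (r - g)/(1 + g).
# Function names are hypothetical and chosen only for this example.

def implied_growth(r, dividend_yield):
    """Solve (r - g)/(1 + g) = dividend_yield for the growth rate g."""
    return (r - dividend_yield) / (1.0 + dividend_yield)


def implied_dividend_yield(r, g):
    """Dividend-price ratio implied by equation (1) for given r and g."""
    return (r - g) / (1.0 + g)


if __name__ == "__main__":
    for r in (0.05, 0.065, 0.075):
        # Dividend yield held at 0.051, growth rate adjusts.
        g_given_yield = implied_growth(r, 0.051)
        # Capital gain rate held at 0.034, dividend yield adjusts.
        yield_given_g = implied_dividend_yield(r, 0.034)
        print(f"r={r:.3f}  g (yield fixed)={g_given_yield:+.4f}  "
              f"yield (g fixed)={yield_given_g:.4f}")
```

Holding the yield at 0.051 with r = 0.05 gives a negative g, which is the point made about the required capital gain.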

Second, the sample mean dividend-price ratio is not robust to
apparently small differences in data definition. For example, if one uses
the definitions of LeRoy and Porter (1981), namely Standard and
Poor’s prices at year end rather than an average over January and
deflation by the GNP implicit deflator rather than by the producer
price index, the mean dividend-price ratio for 1871-1979 is 0.046.⁴
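
The role of Jensen's inequality in these estimators can be illustrated numerically. The sketch below uses made-up dividend-price ratios (an assumption for illustration only, not the Standard and Poor's data) and compares the inverse of the mean dividend-price ratio with the mean of the price-dividend ratios.

```python
from statistics import mean

# Hypothetical dividend-price ratios, invented for illustration only.
dp_ratios = [0.03, 0.04, 0.05, 0.06, 0.08]

# Estimator described in the text as Shiller's: inverse of the mean d/p ratio.
a_inverse_of_mean_dp = 1.0 / mean(dp_ratios)

# Alternative procedure (a): the sample mean price-dividend ratio.
a_mean_pd = mean([1.0 / x for x in dp_ratios])

# Because 1/x is convex for x > 0, the mean of the inverses is at least
# the inverse of the mean, so procedure (a) gives the larger estimate of a.
print(a_inverse_of_mean_dp, a_mean_pd)
```

The same ordering appears in the figures quoted in the text: the inverse of 0.046 is below the reported mean price-dividend ratio of 23.30.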

However, although other comments could be made, I do not believe
that the matter deserves endless nit-picking. The point I originally
made in this regard (sec. IIIA2) is that if stock prices are
nonstationary, then the results from simulations that ignore real-world
complications such as nonconstant discount rates, dividend smoothing,
and the peso problem show that there is little evidence from the
theoretically meaningless comparison of sample “variances” of price
and $p_t^*$ to support a claim that nonrational forces drive stock price
changes. In my judgment, the point still stands.


3 Using eq. (1), let $d_t/p_t = 0.051$ and r = 0.05, and solve for g.

4 Incidentally, for these data, the mean price-dividend ratio is 23.30,
illustrating Jensen's inequality discussed above.

Confirmations and Contradictions


Urban Commuting Journeys 
Are Not "Wasteful” 

Michelle J. White 

University of Michigan 


Do urban workers commute too much? Bruce Hamilton (1982) was 
the first to raise the question whether urban workers’ commuting 
journeys are too long or, in his terms, “wasteful.” He argued that the 
monocentric urban model predicts that workers’ commuting journeys 
will be minimized. To test the model, he calculated the minimum 
commuting journey length for the average worker in a group of U.S. 
cities and compared the results to the actual average commuting journey
length for those workers. He assumed that any difference between
the two figures was “wasteful commuting.” He found that the
average minimum commuting distance was only 1.1 miles, but the 
average distance actually commuted by workers in those cities was 8.7 
miles, or nearly eight times as great. Hamilton therefore concluded 
that the monocentric urban model has little predictive value concern¬ 
ing commuting behavior and that actual commuting behavior could 
be predicted just as well using an assumption that commuting is ran¬ 
dom. 

Commuting behavior is a central feature of any model that purports
to explain urban residential and job location choice. Hamilton’s
assertion that the monocentric urban model has little predictive value
concerning commuting behavior therefore strikes at the heart of
modern urban economics. But Hamilton made such strong and inclusive
assumptions concerning the definition of wasteful commuting
that no city whose residents determine their locations by the postulates
of economic rationality would be expected to satisfy them. In this
paper I calculate new estimates of the average minimum commuting
journey length in a sample of U.S. cities, using a more reasonable
interpretation of what urban models would predict concerning location
behavior by workers and firms. Comparing the resulting estimates
of minimum commuting with data on the actual commuting
journey length by workers in the same cities results in new estimates
of the amount of wasteful commuting. For a sample of cities that
overlaps Hamilton’s, I find that only around 11 percent of the actual
amount of commuting in urban areas is wasteful. Thus waste in fact
appears to be only a minor factor in explaining the commuting behavior
of U.S. urban workers.

I am grateful to the National Bureau of Economic Research for research
support, to Richard Arnott, Ralph Braid, Paul Courant, Roger H. Gordon,
Bruce Hamilton, Janet Kohlhase, Steve LeRoy, Janice Madden, and James
Poterba for helpful comments on various drafts, and to John H. Miller for
providing the computer program used in Sec.

Section I of the paper discusses the predictions of the monocentric 
urban models theory concerning commuting decisions by workers. 
Section II presents the assignment model approach used here to cal¬ 
culate new estimates of the minimum commuting journey predicted 
by the monocentric urban model. 

I. Predictions of the Monocentric Urban Model 
Concerning Commuting 

In the simplest monocentric urban model, all households have identical
tastes and have one worker, all workers have identical jobs and
earnings, and all jobs are at the central business district (CBD).
Households choose their residential locations by maximizing utility
functions subject to budget and time constraints. Commuting is assumed
both to take time and to cost money. Residential locations are
characterized by distance from the CBD, with the city assumed to be
identical in all directions. Workers are willing to choose residential
locations that involve longer commutes because housing prices fall
with greater distance from the CBD. All commuting is on radial roads
that are assumed to be ubiquitous. The average one-way commuting
journey length therefore equals the average residential distance from
the CBD. In the centralized employment model, which worker takes
which job is irrelevant.¹

1 See Mills (1967) and Muth (1969) for development of the monocentric urban model.

Now introduce partial employment decentralization into the model
but hold other assumptions unchanged.² Following Hamilton, I assume
that the spatial patterns of employment location and housing
location are both fixed. When any jobs move out of the CBD, the
pairing of individual workers' residential and job locations becomes
important. The minimum possible average commuting journey
length for workers in a city occurs if the following properties hold for
all workers employed at suburban jobs: (1) workers’ jobs are on the
same ray from the CBD as their houses and (2) their jobs are closer to
the CBD than their houses. If these two conditions are satisfied for all
workers, then all commuting in the urban area will be in-commuting,
that is, toward the CBD along a single ray during the home-to-work
journey. There will be no out-commuting and no circumferential
commuting, that is, no commutes that are away from the CBD during
the home-to-work journey or that start on one ray from the CBD and
end on another. Further, if all workers in the city commute inward,
then the average commuting journey length of workers in the city will
be minimized.

2 Urban models that explore suburbanized employment include White
(1976, 1986), Ogawa and Fujita (1980), and Straszheim (1984).

Hamilton assumes that these two conditions are both satisfied for all
workers in all cities when he calculates his estimates of the average
minimum commuting distance in a city. He therefore assumes that all
commuting in excess of the distance required for workers to commute
inward to their jobs is wasteful.³ However, in actuality, when firms
move out of the CBD, they usually choose suburban locations that are
concentrated at particular suburban subcenters. This causes suburban
jobs to have a distribution around the CBD different from the
distribution of workers’ residences. As a result, not all workers can
commute inward to their jobs. Under what circumstances will workers
choose to commute outward or circumferentially to suburban jobs,
and what effect does this have on the average length of commuting
journeys in the urban area?

3 These assumptions are somewhat hidden by Hamilton’s use of negative
exponential density gradients to represent the spatial patterns of jobs
and housing. They require that the density patterns of jobs and housing
be identical along all rays from the CBD.

As an example, suppose that an arbitrary large firm (or group of
firms) called firm A moves from the CBD to a suburban location one
mile east of the CBD. All jobs in the urban area are now located either
at the CBD or at firm A. Figure 1 shows the CBD of the urban area at
the origin of a graph and firm A at (1, 0). The outer boundary of the
city is the curve ced. Given firm A’s location, only workers that live
more than one mile from the CBD and along the x-axis can commute
inward to it. Workers are willing to commute inward to a suburban
firm if it pays a wage equal to the wage at the CBD minus workers’
savings in commuting costs from working at the suburban firm. Assume
that the wage per day at the CBD is w* and that commuting
costs are γ per mile round trip. Then firm A’s in-commuting wage is
w* - γ. At this wage, only workers that can commute inward to firm A
will be willing to work there. Therefore, firm A’s in-commuting region
consists solely of workers living along the line segment Ae.

However, if firm A is large, then it may demand more workers than
those who live along the line segment Ae. Then to induce some workers
to commute outward or commute circumferentially to the firm, it
must raise its wage above w* - γ. Suppose that firm A offers a wage
of w', which is above the in-commuting wage but below the CBD
wage, or w* - γ < w' < w*. Also suppose that the urban area has
straight-line roads connecting all residences to all workplaces. Finally,
assume that all households in the urban area consume the same
amount of housing.⁴

4 Urban models typically assume that the amount of housing consumed rises
with distance from the CBD. Making this assumption would not change the
general results obtained here.

The rise in wages paid by firm A increases the size of the commuting
region from which workers are willing to commute to firm A.
Workers are indifferent between commuting to two different job
locations when the wages at both job locations minus commuting costs
are equal. Therefore, the equation defining the boundary of firm A’s
commuting region is

$$w^* - \gamma (x^2 + y^2)^{1/2} = w' - \gamma [(x - 1)^2 + y^2]^{1/2}, \qquad (1)$$

where x and y are the coordinates of a worker’s residential location on
the graph of the urban area, $(x^2 + y^2)^{1/2}$ is the distance between
point (x, y) and the CBD, and $[(x - 1)^2 + y^2]^{1/2}$ is the distance
between point (x, y) and point A.
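
The boundary condition in equation (1) is easy to evaluate for particular residences. The following is a minimal sketch under the assumptions of the example (CBD at the origin, firm A at (1, 0), straight-line roads); the function names and the numerical wages are hypothetical and chosen only for illustration.

```python
import math

# Sketch of the job-choice rule behind equation (1).
# CBD at (0, 0), firm A at (1, 0); gamma is the commuting cost per mile.
# All names and numbers below are illustrative assumptions.

def net_wage_cbd(x, y, w_star, gamma):
    """CBD wage net of commuting cost from residence (x, y)."""
    return w_star - gamma * math.hypot(x, y)


def net_wage_firm_a(x, y, w_prime, gamma):
    """Firm A wage net of commuting cost from residence (x, y)."""
    return w_prime - gamma * math.hypot(x - 1.0, y)


def chooses_firm_a(x, y, w_star, w_prime, gamma):
    """True if the worker at (x, y) is at least as well off working at A."""
    return net_wage_firm_a(x, y, w_prime, gamma) >= net_wage_cbd(x, y, w_star, gamma)


def inner_boundary_x(w_star, w_prime, gamma):
    """Point b on the x-axis between the CBD and A where the two net wages are equal."""
    return 0.5 * (1.0 + (w_star - w_prime) / gamma)


if __name__ == "__main__":
    w_star, w_prime, gamma = 100.0, 95.0, 10.0  # w* - gamma < w' < w*
    print(inner_boundary_x(w_star, w_prime, gamma))
    print(chooses_firm_a(0.8, 0.0, w_star, w_prime, gamma))
    print(chooses_firm_a(0.7, 0.0, w_star, w_prime, gamma))
```

Setting w' equal to w* - γ in this sketch leaves only residences on the x-axis beyond A willing to work at the firm, which corresponds to the collapse of the region to the segment Ae.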

Firm A’s commuting region is shown in figure 1 by the shaded area
enclosed by the line cbde. The inner boundary of the region is at point
b, where a worker living along the x-axis is indifferent between
commuting inward and outward. Increases in w' relative to w* cause firm
A’s commuting region to increase in size, while decreases in w' relative
to w* - γ cause firm A’s commuting region to decrease in size
and to collapse eventually to the line segment Ae. The larger area in
figure 1 enclosed by the dashed line shows firm A’s commuting region
if its wage rose but remained below the CBD wage.⁵

Thus the sizes of commuting regions for firms at different locations
are determined by workers choosing job locations to maximize their
wages net of commuting costs. But when workers choose jobs by this
criterion, the total amount of commuting by workers in the urban
area (and the average commuting journey length) is minimized given
the fixed spatial pattern of jobs and housing. This is the link that
emerges in the urban models literature between individual workers’
optimizing behavior and minimization of the total amount of commuting
in an urban area.⁶ But workers are assumed to choose job
locations taking the spatial pattern of workplaces as fixed. If the
pattern of commuting that results when workers choose job locations to
maximize net earnings includes out-commuting or circumferential
commuting, then there must be more commuting in total and on
average in the urban area than there would be if the spatial pattern of
job locations could be rearranged so that all workers could commute
inward to their jobs.

5 The outer boundary of firm A’s commuting region actually will bulge out
slightly beyond the circle of radius 0*. Note that the shape of firm A's
commuting region would be similar if the more realistic assumption were
made that housing consumption rose with greater distance from the CBD. In
that case, the boundary of firm A’s commuting region would be determined
by the condition that households living along the boundary must achieve
the same level of utility if their workers commuted to the CBD vs. to
firm A.

6 In the example in fig. 1, this implies that the average commuting
journey length by all workers in the city is minimized when workers
living in the shaded area choose jobs at A and all other workers choose
jobs at the CBD. To show this, suppose that a worker lives at an
arbitrary point $(x_1, y_1)$, which is not in the shaded region in fig.
1, and commutes to the CBD. Another worker lives at $(x_2, y_2)$, which
is in the shaded region, and commutes to firm A. If total commuting is
minimized by workers in the shaded region commuting to firm A and workers
not living in the shaded region commuting to the CBD, then commuting by
the two workers together must increase if they switched jobs. This
implies that
$(x_1^2 + y_1^2)^{1/2} + [(x_2 - 1)^2 + y_2^2]^{1/2} < (x_2^2 + y_2^2)^{1/2} + [(x_1 - 1)^2 + y_1^2]^{1/2}$.
Each worker decides where to work by choosing the job location where
wages net of commuting costs are highest. For the worker at
$(x_1, y_1)$ to choose a job at the CBD, the wage net of commuting costs
must be higher there than at firm A, or
$w^* - \gamma (x_1^2 + y_1^2)^{1/2} > w' - \gamma [(x_1 - 1)^2 + y_1^2]^{1/2}$.
Similarly, for the worker at $(x_2, y_2)$ to choose a job at firm A,
it must be the case that
$w' - \gamma [(x_2 - 1)^2 + y_2^2]^{1/2} > w^* - \gamma (x_2^2 + y_2^2)^{1/2}$.
But if the two inequalities defining workers' job location choices are
added together, the resulting expression is the condition that the two
workers minimize the sum of their commuting distances. The same argument
can be made for any pair of workers in the city.

Thus when predictions concerning commuting behavior are devel¬ 
oped using the monocentric urban models approach, the assumption 
that workers maximize earnings net of commuting costs is enough to 
assure that the total amount of commuting and the average commuting
journey length in the urban area will be minimized. But workers
make their decisions subject to the fixed spatial pattern of job
locations in the urban area. They therefore may end up commuting outward
or circumferentially to a suburban subcenter. Further, workers
make their decisions subject to the constraint that they travel via the
existing transportation network in the city. These two factors cause
the minimum amount of commuting by all workers in the city and the
minimum average commuting journey in the city to be higher than 
they would be if all workers could commute inward or if all com¬ 
muting trips could take place along ubiquitous, straight-line roads. 
But the postulates of rational behavior by urban workers cannot go 
so far as to require that workers do the impossible: commute only in¬ 
ward when the spatial pattern of job locations requires some out- 
commuting or commute along only straight-line routes when the ac¬ 
tual road network is a grid pattern or a series of former cow paths. 
The urban monocentric model should not be interpreted to require 
that workers do more than choose rationally with respect to the exist¬ 
ing spatial pattern of jobs and the existing transportation network. 
Only commuting that exceeds this amount should be counted as 
“wasteful.” 
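
The claim that individually rational job choice minimizes total commuting, given fixed job and housing locations, can be checked numerically for the two-workplace example. The sketch below is an illustration under assumed values (random residences, hypothetical wages, straight-line distances, CBD at the origin and firm A at (1, 0)); it lets each worker pick the job with the higher net wage and then confirms that no swap of jobs between a CBD worker and a firm A worker shortens their combined commute.

```python
import math
import random

# Illustrative check of the pairwise-switching argument.
# All parameter values and names are assumptions made for this example.

def choose_job(x, y, w_star, w_prime, gamma):
    """Return 'A' or 'CBD', whichever gives the higher wage net of commuting cost."""
    net_cbd = w_star - gamma * math.hypot(x, y)
    net_a = w_prime - gamma * math.hypot(x - 1.0, y)
    return "A" if net_a > net_cbd else "CBD"


def commute(x, y, job):
    """Straight-line distance from residence (x, y) to the chosen workplace."""
    return math.hypot(x - 1.0, y) if job == "A" else math.hypot(x, y)


def swap_never_helps(homes, w_star, w_prime, gamma, tol=1e-9):
    """True if no CBD worker and firm A worker can cut their joint commute by trading jobs."""
    jobs = [choose_job(x, y, w_star, w_prime, gamma) for x, y in homes]
    for h1, j1 in zip(homes, jobs):
        for h2, j2 in zip(homes, jobs):
            if j1 == "CBD" and j2 == "A":
                before = commute(*h1, "CBD") + commute(*h2, "A")
                after = commute(*h1, "A") + commute(*h2, "CBD")
                if after < before - tol:
                    return False
    return True


rng = random.Random(0)  # fixed seed so the example is reproducible
homes = [(rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)) for _ in range(200)]
print(swap_never_helps(homes, w_star=100.0, w_prime=95.0, gamma=10.0))
```

The check mirrors the argument in the text: adding the two workers' choice inequalities cancels the wage terms and leaves a comparison of commuting distances only.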

One further issue is whether it is reasonable to assume that firms
actually locate in spatially concentrated suburban subcenters (such as
point A in fig. 1) or whether profit-maximizing firms would actually
choose locations that are spread out uniformly around the CBD so as
to allow workers to commute inward. Clearly there is an incentive for
firms to spread out uniformly around the CBD since by doing so they
can save the extra wage payments necessary to induce workers to
commute outward or circumferentially. In general, firms choose locations
within a metropolitan area by a process of cost minimization. By
moving to the suburbs, they save on the cost of land (since the price of
land falls with greater distance from the CBD), they also may save on
the costs of transporting inputs and outputs by avoiding CBD congestion,
and they save on workers’ wages, with the amount of savings
depending on the extent to which their workers must commute outward
or circumferentially. The costs of production also may be lower
in the suburbs for some firms, such as manufacturing firms that
realize cost savings from having large sites to accommodate horizontal
assembly lines. Thus we expect that at least some firms will choose
suburban locations even though they are large and need work forces
that must commute outward or circumferentially. In general, firms
making location decisions minimize a broader set of costs than just
workers’ commuting costs.⁷

II. New Estimates of the Amount of Waste 
in Urban Commuting 

We have shown that Hamilton’s measure of wasteful commuting actu¬ 
ally includes three separate factors leading to extra commuting. 
These are (1) differences between the spatial distributions of jobs and 
residences around the CBD, which are caused by concentrations of 
employment at suburban subcenters; (2) the fact that the actual road 
network is not ubiquitous, so that commuting journeys do not pro¬ 
ceed along straight-line routes; and (3) the existence of commuting 
trips that could be shortened if workers trade jobs or residences, 
thereby reducing the total amount of commuting in the metropolitan 
area. (This latter will be referred to as “cross-commuting.”) Of these 
three sources of extra commuting, only the third should be counted 
in determining the amount of wasteful commuting because only 
cross-commuting can be eliminated if workers trade jobs or houses 
given the fixed spatial pattern of workplaces and residences. But 
Hamilton’s own method includes all three. 

A new method of calculating the average minimum commuting
journey length using an assignment model enables us to separate out
the amount of extra commuting actually due to cross-commuting
from that due to the first and second factors of the three listed above.
The 1980 Census of Population (subject report, Journey to Work:
Characteristics of Workers in Metropolitan Areas [sec. 1]) divides
metropolitan areas into political jurisdictions including the legal
central city, suburban towns having populations of 25,000 or more, the
remaining parts of suburban counties, and sometimes entire suburban
counties. For each jurisdiction, data are given on how many workers live
in that jurisdiction and commute to workplaces in each of the other
jurisdictions within the same metropolitan area. Also, the average time
spent commuting is given for workers who live in each jurisdiction and
commute to workplaces in each of the other jurisdictions. For purposes
of the workplace breakdown, the legal central city is divided
between jobs in the CBD and jobs in the rest of the central city.⁸

7 For the 49 largest U.S. metropolitan areas, only 8 percent of jobs are
located in the CBDs and 48 percent of jobs are located in the legal
central cities. This suggests that even some large firms must have found
it profitable to choose suburban locations, even though they must pay
their workers extra to commute outward.

Suppose that a metropolitan area has residential jurisdictions,
denoted i, that include the legal central city and its suburban
jurisdictions. The same metropolitan area has separate workplace
jurisdictions, denoted j, that include the CBD, the remainder of the
central city, and the same set of suburban jurisdictions. There are I
separate residential jurisdictions and J separate workplace
jurisdictions, where J = I + 1.⁹ The number of workers who live in
jurisdiction i and commute to workplaces in jurisdiction j is denoted
$n_{ij}$. A matrix having dimensions I by J and elements $n_{ij}$ is
constructed of the number of workers who commute from any residential
location to any workplace location.

The total number of workers living in the ith jurisdiction is denoted
$N_i$, where $N_i = \sum_j n_{ij}$. The total number of workers living
anywhere in the metropolitan area is denoted N, where $N = \sum_i N_i$.
The total number of workplaces in the jth jurisdiction is denoted $M_j$,
where $M_j = \sum_i n_{ij}$. The total number of workplaces anywhere in
the metropolitan area is denoted M, where $M = \sum_j M_j$. Workers who
live in the standard metropolitan statistical area but work outside it
and workers who live outside the area but work in it are excluded from
the analysis. Therefore, the total number of workers must equal the total
number of jobs in the metropolitan area, or N = M.¹⁰

A matrix of actual commuting times having dimensions I by J is also
constructed. Its elements are denoted $t_{ij}$. Average actual commuting
time for workers in the metropolitan area, denoted $\bar{t}$, is the
weighted sum of the matrix of commuting times, with weights equal to the
proportion of workers in the metropolitan area commuting from
jurisdiction i to j. Thus $\bar{t} = \sum_i \sum_j t_{ij} n_{ij} / N$.


8 Commuting time is used here rather than commuting distance to measure work¬ 
place-residence separation for convenience reasons, since the census does not give 
distance data, and because time spent commuting is a better measure than distance of 
the cost of commuting, since it is workers' time that is scarce and is economized on. 
Also, independent evidence suggests that workers who commute further tend to travel 
at considerably higher average speeds. Cherlow and Morgan (1976) found that a group 
of workers who commuted less than 6 miles had an average speed of 15 mph, while a
group of workers who traveled 11 miles or more had an average speed of 34 mph. The 
average distance traveled of the former group was 2.6 miles and that of the latter group 
was 21.2 miles. Thus an eightfold increase in distance was associated with only a 
threefold increase in commuting time. 

9 Occasionally, a metropolitan area has two CBDs. Then J = I + 2.

10 Workers who are unemployed or who do not report a fixed place of work
are excluded.
We wish to construct a figure for the average minimum commuting
time in the metropolitan area that corrects for cross-commuting but
not for extra commuting due to the differing spatial patterns of jobs
and housing or due to the actual road network. Assume that the
actual spatial patterns of jobs and housing are represented by the
numbers of jobs and residences located in each jurisdiction, or by
the vectors $N_i$ and $M_j$, which are assumed to be fixed. Also assume
that the actual road network is represented by the matrix of actual
average commuting times between jurisdictions, whose elements are
$t_{ij}$. These values are also assumed to be fixed. Then we can
determine the average minimum commuting time figure by solving for the
assignment of workers to jobs that minimizes the total time spent
commuting by all workers in the metropolitan area.

The optimization problem thus solves for a new matrix of worker-to-job
assignments that minimizes the total time spent commuting by
all workers in the metropolitan area. Suppose that the elements of this
matrix are denoted $n_{ij}^*$. The optimization problem is then¹¹

$$\min Z = \sum_i \sum_j t_{ij} n_{ij}^* \qquad (2)$$

subject to the constraints $\sum_i n_{ij}^* = M_j$,
$\sum_j n_{ij}^* = N_i$, and $n_{ij}^* \ge 0$. The solution matrix is
then used to solve for the minimum average commuting time in the
metropolitan area, which is denoted $\hat{t}$. It is
$\hat{t} = \sum_i \sum_j t_{ij} n_{ij}^* / N$.

The difference between the average actual time spent commuting, 
l, and the average minimum time spent commuting, t, is cross- 
commuting, which could be eliminated if workers trade residences or 
jobs. This corresponds exacdy to Hamilton's definition of wasteful 
commuting, except that our procedure has eliminated the actual road 
network and the differing spatial patterns of jobs and housing as 
additional contributors to the measured amount of wasteful commut¬ 
ing. The proportion of commuting that is wasteful in these calcula¬ 
tions is expected to be smaller than that found by Hamilton because it 
eliminates these two additional sources of extra commuting. 
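
As a rough illustration of the assignment problem in equation (2), the following Python sketch computes the minimum average commute and the wasteful share for a tiny two-jurisdiction example. All numbers and function names are hypothetical. Exhaustive enumeration is used here only for transparency; it is not the algorithm used in the paper and is practical only for a handful of workers.

```python
from itertools import permutations


def min_average_commute(times, residents, jobs):
    """Minimum average commute over all worker-to-job assignments.

    times[i][j] is the commute time from residence jurisdiction i to job
    jurisdiction j; residents[i] and jobs[j] are the fixed counts.
    Brute force: feasible only for very small examples.
    """
    # Expand jurisdiction counts into one entry per worker and per job.
    homes = [i for i, count in enumerate(residents) for _ in range(count)]
    works = [j for j, count in enumerate(jobs) for _ in range(count)]
    if len(homes) != len(works):
        raise ValueError("number of workers must equal number of jobs")
    best_total = min(
        sum(times[h][w] for h, w in zip(homes, perm))
        for perm in set(permutations(works))
    )
    return best_total / len(homes)


def average_commute(times, flows):
    """Average commute implied by flows[i][j] workers living in i, working in j."""
    total_time = sum(
        times[i][j] * n for i, row in enumerate(flows) for j, n in enumerate(row)
    )
    return total_time / sum(sum(row) for row in flows)


# Hypothetical data: two jurisdictions, five workers.
times = [[10, 30], [30, 12]]
actual_flows = [[1, 2], [1, 1]]                    # observed pattern
residents = [sum(row) for row in actual_flows]     # residences per jurisdiction
jobs = [sum(col) for col in zip(*actual_flows)]    # jobs per jurisdiction

t_actual = average_commute(times, actual_flows)
t_min = min_average_commute(times, residents, jobs)
wasteful_share = (t_actual - t_min) / t_actual
print(t_actual, t_min, round(wasteful_share, 3))
```

Because the residence and job totals are held fixed, any reduction in average commuting time in this sketch comes only from workers trading jobs or residences, which is the sense of cross-commuting used in the text.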

In choosing cities, I included all the cities studied by Hamilton, for 
purposes of comparison with his study, plus several cities chosen be¬ 
cause they contain large numbers of separate jurisdictions. The re¬ 
sults are given in table 1. Columns 1 and 2 give the actual average and 
minimum average commuting times for workers in each metropolitan 
area. Column 3 gives the proportion of commuting that is wasteful, 
equal to $(\bar{t} - \bar{t}^*)/\bar{t}$.

The results are quite striking. They suggest that there is little waste
in commuting behavior by urban workers. The average commuting
journey length is 22.5 minutes, the average minimum commuting
journey is 20.0 minutes, and the average proportion of commuting
that is wasteful is .11. The number of minutes added to the average
commuting journey by cross-commuting is 2.5.

11 The algorithm used here is taken from Syslo, Deo, and Kowalik (1985).

TABLE 1

Average Actual and Minimum Commuting Journey Lengths and Wasteful Commuting

Column 4 of table 1 gives Hamilton’s results for the proportion of 
commuting that is wasteful, measured in the same way as in column 3, 
for those cities included in his study. 12 Measured in this way, his 
average proportion of commuting that is wasteful is .87, or eight 
times the results obtained here. Since both sets of figures for wasteful 
commuting include commuting that could be eliminated if workers 
trade jobs or residences, a comparison of Hamilton’s results with mine 
suggests that the influence of the actual road network and of the 
differing spatial patterns of jobs and housing, which add to Hamil¬ 
ton’s calculations of waste in commuting but not to the results pre¬ 
sented here, is much more important than the influence of cross- 
commuting in explaining Hamilton’s results. If these factors are 
eliminated, then the amount of waste in the urban commuting pat¬ 
tern falls to a small proportion of actual commuting. 

In order to explain why my results for the amount of wasteful 
commuting are so low, it is of interest to examine the characteristics of 
the actual spatial pattern of jobs and housing more closely. Column 6 
of table 1 gives the average ratio of jobs to houses in suburban juris¬ 
dictions for each of the metropolitan areas in the sample. 13 The aver¬ 
age value for the sample of cities is 0.80. Thus the majority of workers 
who live in any suburban jurisdiction can take a job in the same 
jurisdiction if they choose. 14 Further, the characteristics of the road 
network suggest that the worker has a strong incentive to do so. A 
general characteristic of the commute time matrix is that the average 
time required for commuting journeys that begin and end in the same 
jurisdiction is smaller than for any journey that crosses jurisdiction 
lines; that is, $t_{ij}$ is minimized for $i = j$. Typically, commuting time is
next lowest for journeys to a few nearby jurisdictions and rises steeply 
for journeys to the CBD and to nonadjacent suburban jurisdictions. 15
The low wasteful commuting results suggest that the number of
workers who commute between nonadjacent suburban jurisdictions
(which are the journeys most likely to be wasteful) is small.

12 The figure given here is Hamilton's "mean actual commute" minus his
"mean optimum commute," divided by the former, or (col. C - col. D)/col. C
of his table 1 (1982, p. 1041).

13 This figure is the simple average of $M_i/N_i$ for all jurisdictions
included in the commuting calculations except the legal central city
(or cities).

14 Given the degree of aggregation in the data, this figure is still
consistent with substantial differences in the spatial distributions of
jobs and housing. However, the figures do suggest that the stereotypical
"bedroom suburb" is a disappearing phenomenon.

15 The average within-jurisdiction commute for each metropolitan area,
excluding commutes within the central city, ranges from 11 to 17 minutes
in the group of metropolitan areas studied, with an average value of 13
minutes for the sample. The average time spent commuting to the CBD of
each metropolitan area from all jurisdictions ranges from 24 to 58
minutes in the sample, with an average value of 38 minutes. The longest
commuting journey in each commuting time matrix for the group of cities
studied ranges from 35 to 99 minutes. The average of the maximum values
is 60 minutes.

Variation in the number of jurisdictions on which the calculations
of wasteful commuting for each metropolitan area are based presents
a potential source of bias in the calculations. (The number of separate
workplace jurisdictions for each metropolitan area is given in table 1,
col. 5.) As indicated, normally the length of the average commuting
journey within a jurisdiction is shorter than the length of any
commuting journey that crosses jurisdiction lines. This means that any
commuting journey within a jurisdiction is already efficient and will
not be changed by the assignment procedure. 16 In contrast, commuting
journeys that cross jurisdiction lines have greater potential to
differ in the optimal versus the actual assignment of workers to jobs.
But as the number of jurisdictions in a metropolitan area falls, their
size increases and the proportion of commuting journeys that are
within jurisdictions rises. Therefore, the observed pattern of
commuting will be closer to the optimal assignment pattern. This suggests
that the proportion of commuting that is wasteful may tend to fall as
the number of jurisdictions falls.

16 The solution matrix to the assignment problem typically specifies
that workers living in a particular residential jurisdiction commute
only to jobs within that jurisdiction or to jobs in one or two other
jurisdictions, usually the CBD or an adjacent suburban jurisdiction.
All other entries in the solution matrix are zero. Thus long commuting
journeys, except those to the CBD, are almost always eliminated.

Therefore, increases in the number of jurisdictions may have a
positive effect on the measured amount of wasteful commuting. However,
this effect should decrease in importance as the number of
jurisdictions rises. One reason for this is that the longest outward or
circumferential commuting journeys, such as a journey from a
residence near the CBD to a job in the outer suburbs, would tend to be
eliminated in the optimal assignment regardless of whether there are
many separate jurisdictions or few. In addition, as the number of
jurisdictions rises, their average size gets smaller. This means that
some commuting journeys that were previously efficient because they
were within a jurisdiction become wasteful when the number of
jurisdictions rises because they extend between two adjacent jurisdictions
and are outward or circumferential. But as the number of jurisdictions
rises, the time difference between the length of the average
commuting journey within a jurisdiction and the length of the average
commuting journey between adjacent jurisdictions falls. 17 Thus
while the assignment problem may rearrange more commuting jour¬ 
neys when there are more—but smaller—jurisdictions, the gain in 
terms of reduced commuting when workers trade jobs or houses is 
likely to be smaller. Therefore, holding other factors constant, we 
expect that as the number of jurisdictions rises, the calculated amount 
of wasteful commuting may rise, but at a diminishing rate. 

This suggests that if wasteful commuting were measured with a 
data set having more jurisdictions, the proportion of commuting that 
would be found to be wasteful by the assignment method would prob¬ 
ably rise somewhat. However, given the fact that the method used 
here to measure wasteful commuting corrects for the effects of the 
differing spatial patterns of jobs and housing and for the actual road 
network, it seems unlikely that the amount of wasteful commuting 
would rise by nearly enough to dominate the actual commuting 
figures, as Hamilton found. 18 
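
The piecewise-linear relationship reported in footnote 18 can be written out as a short sketch. The kink at 10 jurisdictions and the implied slopes of .0115 and .0023 are taken from that footnote; the regression intercept is not reported, so the default value below is a placeholder chosen only to make the function runnable, and the function names are illustrative.

```python
KINK = 10             # jurisdictions at which the slope changes (footnote 18)
SLOPE_LOW = 0.0115    # implied slope for 10 or fewer jurisdictions
SLOPE_HIGH = 0.0023   # implied slope for more than 10 jurisdictions

# Coefficients on the two regressors: the number of jurisdictions J and
# the kink variable max(J - KINK, 0).
BETA_J = SLOPE_LOW
BETA_KINK = SLOPE_HIGH - SLOPE_LOW  # negative: the slope flattens past the kink


def kink_variable(jurisdictions):
    """Number of jurisdictions minus 10 if nonnegative, else zero."""
    return max(jurisdictions - KINK, 0)


def predicted_wasteful_share(jurisdictions, intercept=0.0):
    """Fitted proportion of wasteful commuting (intercept is hypothetical)."""
    return (
        intercept
        + BETA_J * jurisdictions
        + BETA_KINK * kink_variable(jurisdictions)
    )


slope_below = predicted_wasteful_share(10) - predicted_wasteful_share(9)
slope_above = predicted_wasteful_share(21) - predicted_wasteful_share(20)
print(round(slope_below, 4), round(slope_above, 4))
```

Under this specification the coefficient on the kink variable is the difference between the two implied slopes, so a negative estimate corresponds to wasteful commuting rising with the number of jurisdictions at a diminishing rate.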

The results presented here suggest that monocentric urban models 
are in better shape than Hamilton's gloomy diagnosis would imply but 
that further research is still needed to explain why some urban work¬ 
ers voluntarily choose “wasteful” commuting trips. One factor likely 
to be important in explaining these choices is the increasing preva¬ 
lence of two-worker households. Workers in these households must 
commute to two different workplaces from a single residence. This 
makes it more likely that one or both workers will choose an outward 
or circumferential commute. However, the assignment model treats 
these workers as though they lived in separate households and it 
assigns them to residences in different jurisdictions if doing so will 
reduce the average commuting journey length in the urban area. 
Another such factor is the proportion of black and minority workers
in the urban labor force. To the extent that these workers face
discrimination in either housing or job markets, they have more
restricted choices of housing and job locations than white workers. They
therefore are more likely to end up commuting outward or
circumferentially because they cannot shift residential locations to commute
inward to suburban jobs or shift job locations to accommodate
centrally located residences. Thus cities having more two-worker
households or more black and minority workers are likely to have more
wasteful commuting. Further research probing the roles of these and
other factors in causing some workers to choose wasteful commutes is
clearly needed. It is hoped that future urban economists will not have
to characterize any commuting behavior as wasteful and instead will
be able to explain it.

17 If jurisdictions were square and the commuting journeys between
adjacent jurisdictions were always from the center of one to the center
of the other, then halving the dimensions of each jurisdiction would
imply a fourfold increase in the number of jurisdictions, but the length
of commutes between adjacent jurisdictions would drop by half.

18 To test whether the relationship between wasteful commuting and
number of jurisdictions diminishes with more jurisdictions, I regressed
the proportion of commuting that is wasteful on a constant term, the
number of jurisdictions, and on a variable equal to the number of
jurisdictions minus 10, if this was nonnegative, or else zero. The
sample was the 28 cities shown in table 1. Both variables were
significant at the 95 percent level. The implied slope of the
relationship between the proportion of commuting that is wasteful and
the number of jurisdictions was .0115 for 10 or fewer jurisdictions, but
only .0023 for more than 10 jurisdictions.
The Case of the Negative Nominal Interest
Rates: New Estimates of the Term Structure
of Interest Rates during the Great Depression


Stephen G. Cecchetti 

Ohio State University and National Bureau of Economic Research 


Throughout the 1930s and early 1940s, U.S. Treasury bonds and 
notes appeared to have negative nominal yields as they approached 
maturity. But negative nominal interest rates are impossible in a 
world in which one can always hold cash. The resolution to this 
puzzle is that Treasury securities, in addition to making coupon 
payments, gave the owner the right to buy a new security on a future
date. This paper describes the institutional environment that led to 
the apparent negative nominal interest rates, develops a method for 
valuing the “exchange privilege,” and computes accurate measures 
of the yield to the coupon-bearing component of these composite 
bond/options. These corrected bond and note yields are then used to 
calculate new estimates of the term structure of interest rates from 
1929 to 1949. 


I. Introduction 

On December 31, 1932, the New York Times listed the yield on a 3.5
percent U.S. Liberty bond as -1.74 percent. This seems impossible.
An investor can always hold cash rather than an interest-bearing
security, so any bond should have a positive nominal yield. It is well known
that during the Great Depression the prices of Treasury bills at
auction occasionally exceeded par. 1 But the negative yields were
extremely small, on the order of -0.05 percent. Yields of this small a
magnitude can be explained by the fact that Treasury bills were
exempt from personal property taxes in some states (see Homer 1977,
p. 355) and that Treasury securities were required as collateral for a
bank to hold U.S. government deposits. 2 Negative nominal yields on
the order of -2 percent are an entirely different story. In fact, from
mid-1932 through mid-1942, the vast majority of coupon-bearing
U.S. government securities bore negative nominal yields as they
neared maturity. 3

This paper is a revised version of NBER Working Paper no. 2472 (Cecchetti 1987).
Much of the work reported in this paper was completed while I was a visiting scholar at
the Federal Reserve Bank of Kansas City. Thanks are due to Bob Cumby for discussions
and aid beyond the call of duty; Rick Mishkin for numerous conversations; Bob
Barsky, Ernie Bloch, Hobart Carr, Ben Friedman, Ed Kane, Peter Temin, Paul Wachtel,
Eugene White, and the participants in the 1987 NBER Summer Institute for discussions
and comments on an earlier draft; and Ellen Nose, Raj Mital, Tom Dean, and
Chuck Larson for untiring research assistance.

Since negative nominal yields are impossible in a world in which 
one can always hold cash, these securities must have had other attri¬ 
butes that were being valued. During the 1930s, the standard practice 
of the U.S. Treasury was to issue new bonds with coupon rates that 
implied market prices above par, but to sell them at par. Holders of 
maturing bonds and notes were given preferential treatment in the 
distribution of these new issues. Coupon-bearing Treasury securities 
had what was called an “exchange privilege.” At maturity, they could 
be exchanged at par for a new issue. Government bonds and notes 
were not just coupon securities; they were options as well. 4 The option 
had value that was included in the quoted price. As a bond ap¬ 
proached maturity, this premium caused the price to rise high 
enough that the computed yield was negative. 

The solution to the first puzzle, that of the negative nominal inter¬ 
est rates, has given way to a second one: Why did the Treasury sell 
new issues at prices below those prevailing in the market? The answer 
to this question can be found by studying the institutional environ¬ 
ment of the 1930s. Legal constraints forced the Treasury to sell new 
securities at par. To ensure that an offering actually sold, the coupon 
rate had to be set above the current market interest rate. Initial pur¬ 
chasers were paid to place the new issue. This was the method of 
underwriting. 

The purpose of this paper is to describe the conditions that led to
the apparent negative nominal interest rates and then use this
information to construct accurate data on the returns to holding U.S.
government securities during the 1930s and 1940s. Proper
computation of the term structure during the 1930s requires careful
examination of the institutions of the bond market and Treasury debt
management. In what follows, a method for valuing the exchange
privilege is described and used to correct the measurement of the
yields of traded securities. These are used to construct term structure
estimates from 1929 to 1949 that are consistent with those currently
in use. These new data replace the sketchy data contained in the
Federal Reserve Board’s Banking and Monetary Statistics of the United
States and for the first time allow one to follow changes in the shape of
the term structure during the Great Depression. The interest rate
data can be added to new data on 3- and 6-month time loans in
Mankiw and Miron (1985) and the new output, production, and
unemployment data in Romer (1986a, 1986b, 1988, 1989).

1 Bids in excess of par were received throughout 1939, 1940, and 1941. The highest
recorded was 100.018 on January 8, 1941. See the Annual Report of the Secretary of the
Treasury (1941, p. 301).

2 I have also heard the claim that banks substituted Treasury bills for currency of
smaller denomination in making interbank transfers, and so the negative yield reflected
convenience. Unfortunately, this could not be substantiated.

3 The plots in both Durand (1942) and the U.S. Treasury Bulletin for 1939 imply
negative nominal yields for maturities below 2 years. Childs (1947, p. 259) also notes
the existence of negative nominal yields in the 1930s but provides no explanation.

4 These options were very similar to the quality or delivery option associated with
Treasury bond futures. See Figlewski (1986, p. 31) for a description.

There are two motivations for constructing this new data set. First, 
empirical research in macroeconomics often relies on the use of 
lengthy time-series data. 5 While Salomon Brothers (1985) publishes 
estimates of yields at 3 months, 1, 2, 3, 4, 5, 10, and 20 years to 
maturity beginning in 1950, 6 data on the term structure of interest 
rates prior to 1950 are noticeably missing. Second, the resurgence of 
interest in the economics of the Great Depression 7 makes it all the 
more important to exploit new data sources. 

The remainder of this paper is divided into four sections. Section II 
describes the raw data collected and used in the study. Section III 
provides a detailed account of the Treasury practices that caused 
nominal interest rates to be negative. The rationale for the Treasury’s 
behavior is also examined. A method for valuing the exchange provi¬ 
sion is then proposed and used to compute the yield to the coupon- 
bearing component of the composite bond/option. Section IV uses 
these corrected yields to construct estimates of the term structure 
using a technique derived by Nelson and Siegel (1985). The conclud¬ 
ing section (Sec. V) provides a comparison of the new interest rate 
series with those previously available and finds that there are substan¬ 
tial differences. The adjustments for the exchange privilege lead to 
systematically higher estimates of yields at maturities below 5 years. 

5 This is true of the original work on business cycle dating summarized in Moore and
Zarnowitz (1986) and the more recent studies of the effects of money by Friedman and
Schwartz and investment by Gordon and Veitch (1986).

6 Recently, McCulloch (1987) has estimated coupon-corrected yield curves for
December 1946 to February 1987 that will likely replace the Salomon data in future
research.

7 Papers by Bernanke (1983, 1986), Field (1984), Bernanke and Powell (1986), and
Hamilton (1987) and the essays in Brunner (1981) are examples.
II. Data

Existing data on nominal interest rates prior to World War II are both 
limited in scope and imprecise. The Federal Reserve Board’s Banking 
and Monetary Statistics of the United States contains several series for 
interest rates during the interwar period, but it is difficult to tell 
exactly how the numbers were constructed and what securities they 
actually refer to. For example, table 122 on page 460 of Banking and 
Monetary Statistics includes monthly series for 3-5-year tax-exempt 
Treasury notes, while table 128 on page 468 reports longer-term 
bond yields under the simple heading “U.S. Government.” The sec¬ 
ond of these refers to the unweighted average of the yield on all 
outstanding bonds with at least 12 years to maturity. Clearly, there is
motivation for collecting a new and more complete set of interest rate 
data. 

Construction of a new data set on the term structure requires infor¬ 
mation on the prices of outstanding Treasury issues. These raw data 
were collected from the New York Times financial column entitled 
“Bond Sales on the New York Stock Exchange.” Quotes on the prices
of all U.S. Treasury bonds, notes, and certificates of indebtedness
were collected from the New York Times for the final trading day of 
each month from January 1929 to December 1949. The data set is 
complete in that it contains a yield for every bond, note, and 
certificate for every month during which it was in existence. It is 
composed of all 152 coupon-bearing securities either in existence in 
January 1929 or issued during the 21-year period examined. Of this 
total, 56 are bonds, 54 are notes, and 42 are certificates of indebted¬ 
ness. 

In addition to coupon securities, beginning in mid-1931 data were 
collected on the yield of Treasury bills with 3 months to maturity; 
prices are not reported. 8 As is currently the case, Treasury bills were 
pure discount securities. Other Treasury bills of shorter maturity 
were excluded since the major objective is to study yields at longer 
maturities. 9 

As is nearly always the case in research on financial markets, the
data refer to dealer price quotes. There is no guarantee that actual
transactions occurred at these prices. This problem is minimized by
computing yields based on the mean of the bid/ask spread. But it is
impossible to know how large an error comes from systematic
differences between dealer quotes and transactions prices.

8 Childs (1947, p. 432) describes early Treasury bill issues. While the first Treasury
bills were issued in 1929, it was not until 1931 that a series can be constructed that is
composed solely of issues with 3 months to maturity. During 1929 and 1930, bills were
issued at irregular intervals and matured in 3, 6, 9, or 12 months. Three-month
Treasury bill rates were found for February, April, and May 1931 as well as every month
beginning with July 1931.

9 In addition, all interest-bearing government debt not issued directly by the
Treasury, such as securities issued by the Federal Home Loan Bank Board or the
Reconstruction Finance Corporation, is omitted.

It is possible, however, to ensure that trading occurred. The New 
York Times does report volume. For example, on January 30, 1932, 
volume in the 3.375 percent Treasury bonds of 1940-43 amounted to 
$130,000. While this is a very small fraction of the nearly $360 million 
of the issue outstanding, it is important that there was some trading. 
To make the data set complete, in several isolated cases it was neces¬ 
sary to use price quotes that did not reflect trading on the New York 
Stock Exchange. These quotes were found in the New York Times 
under the heading “U.S. Bond Quotations—Closing Quotations for 
Issues Not Traded in on [sic] the Stock Exchange Yesterday.” 

Since the majority of U.S. Treasury bonds issued during this period 
contained call provisions, there is a problem in computing the yield to 
maturity. Fortunately, except for several very special cases, all bonds 
were called on the first allowable date. As such, all yields were com¬ 
puted to the call date. 10 

The raw data set consists of 9,070 observations over 252 months, or 
just under 36 observations per month, on average. These raw data are 
available from the author on standard diskettes. As one would expect,
the number of observations is small during the first few years, increas¬ 
ing substantially with the debt issues of the middle 1930s and again 
with the issues during World War II. In 1929, 1930, and 1931, there 
is an average of only 14 data points per month. By 1933, the average 
is over 20, rising steadily to 40 in 1939 and to 54 in 1945 and falling to 
38 in 1949. The implication is that the estimated yield curves will be 
less accurate for the earlier period simply because of the paucity of 
data. 


III. Negative Nominal Yields and the Exchange
Privilege 

Consider the following exercise. Take the data described in Section II
for a representative month and compute the yield to maturity for all
the coupon-bearing securities based on the mean of the bid/ask
spread. The results for February 1935 are plotted in figure 1. 11 In the
figure, N's refer to fully tax-exempt securities and P’s refer to
partially tax-exempt securities. As is discussed in detail in Cecchetti
(1987), this distinction is largely irrelevant prior to 1940 but
important beginning in 1941. 12 The solid line is an estimate of a term
structure using the technique described in Section IV. (Following the
standard convention, all interest rates are in bond yield equivalents:
two times the 6-month rate.)

10 An alternative is to compute the yield to the call date when the price of the security
exceeds par and the yield to the final maturity date when the price is below par. Use of
this rule would have virtually no effect on the results since bonds nearly always sold at
prices above par.

11 Figures were constructed for every month of the data set. From 1934 to
1941, all the figures had the same general features as fig. 1.

Figure 1 has several striking features. First, except for the single N 
representing the 3-month Treasury bill yield of 0.15 percent, the 
yield curve is smoothly upward sloping. If one were to neglect the 
vertical scale, the picture would not seem odd. The problem is that 
the lowest point is a Treasury note with 5 months to expiration and a 
yield of -1.25 percent. If this result were obtained for an isolated 
month, one would be inclined to check the raw price data for errors. 
But negative yields arise consistently from 1932 through 1942. 
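
To see how a quoted price can imply a negative computed yield, the sketch below solves for the yield to maturity in bond yield equivalents (two times the 6-month rate). The security is hypothetical: a note with 5 months to maturity and one remaining payment of 101.5 (par plus a final semiannual coupon on a 3 percent note). The function names are illustrative, and accrued-interest conventions are ignored for simplicity.

```python
def present_value(bond_yield, cash_flows):
    """Discount cash flows at a bond-equivalent yield (twice the 6-month rate).

    cash_flows is a list of (time_in_half_years, amount) pairs.
    """
    six_month_rate = bond_yield / 2
    return sum(amount / (1 + six_month_rate) ** t for t, amount in cash_flows)


def yield_to_maturity(price, cash_flows, lo=-0.5, hi=1.0, tol=1e-10):
    """Solve present_value(y) == price by bisection.

    For positive cash flows the present value falls as the yield rises,
    so the root is bracketed whenever PV(lo) > price > PV(hi).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if present_value(mid, cash_flows) > price:
            lo = mid   # yield too low: present value still above the price
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical note: 5 months (5/6 of a half-year) to the final payment.
flows = [(5 / 6, 101.5)]

y_premium = yield_to_maturity(102.0, flows)   # price exceeds the final payment
y_ordinary = yield_to_maturity(101.0, flows)  # price below the final payment
print(round(100 * y_premium, 2), round(100 * y_ordinary, 2))
```

Whenever the price exceeds the sum of the remaining payments, as it does once an exchange premium is included in the quote, the computed yield is negative even though the coupon-bearing component alone would yield a positive return.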

Discussions of the period note the existence of negative nominal
yields. They point out that during the 1930s the standard practice of 
the U.S. Treasury was to issue new bonds above par and give holders 
of maturing bonds, notes, and certificates preferential treatment in 
distributing the new issue. Maturing securities had an “exchange 
privilege” that gave them added value. 

The remainder of this section is divided into two parts. The first
provides a discussion of the institutional environment that led to the
apparent negative nominal yields and discusses the reason for the
Treasury to issue securities in the way that it did. This is followed by a
detailed description of how to correct the data for the existence of the
exchange privilege.

12 Prior to 1941, the partial tax exemption had the effect of making the interest on
bonds fully taxable to individuals but tax exempt to corporations.

A. The Exchange Privilege 

Each year, the Annual Report of the Secretary of the Treasury describes 
the offerings of securities during that year. In the 1930s, new offer¬ 
ings were announced from 1 to 2 weeks prior to the date of issue. The 
announcement stipulated the method of payment. The purchaser was 
either required to pay cash, required to exchange an existing security 
(valued at par), or given a choice of the two. Of the 86 new and 
additional offerings of bonds, notes, and certificates of indebtedness 
from 1932 to 1940, 15 required cash payment, 31 could be obtained 
only by exchange, and the remaining 40 gave the purchaser a 
choice. 13 

For reasons that will be discussed below, the Treasury’s regular 
practice was to fix the coupon rate on a new issue above the current 
interest rate for a bond of equivalent maturity, causing the initial 
price of the new bond in the securities market to exceed par. Ex¬ 
change allowed the holder of a maturing security to reap the benefit 
of this, giving value to the exchange privilege. 14 Of the 57 coupon- 
bearing securities that matured between 1932 and 1940, 54 could be 
exchanged at maturity for new issues that initially sold in excess of 
par. 

Cash payment was by subscription. Prospective purchasers made
application for a certain amount of the issue and sent either 5 percent
or 10 percent (depending on the issue) of the face value as a deposit.
Subscription was guaranteed up to some level, usually $5,000 or
$10,000. Individuals’ requests in excess of the minimum were filled as
a percentage of the total of all applications. For example, subscribers
to the 3.125 percent 1949-52 bond, whose issue was announced on
December 3, 1934, were allotted 18 percent of the amount they
requested, but not less than $10,000. Between 1932 and 1940, cash
subscribers, on average, were allotted 15.4 percent of their requests,
but not less than $5,263.

13 The total of 86 issues exceeds the actual number of new securities by 19 because of
the practice of making additional offerings of already existing securities.

14 Durand (1942) mentions the exchange privilege but implies that its value is derived
from the saving in brokerage fees that comes from rolling over an investment. It is
difficult to see why someone wishing a long-term security would buy a maturing one
simply for the benefit of having it roll over. Thies (1985), in replicating the work of
Durand, also notes the existence of the exchange privilege and correctly points out the
source of its value.

Once the allotment was determined, a cash subscriber could take 
delivery by paying the remaining balance. For example, a request for 
$100,000 might require a $10,000 deposit. If the final allotment were 
18 percent, then on delivery the subscriber would pay the balance of 
$8,000. Because the bonds were issued above par, a cash subscriber 
could make a profit by selling them immediately. In the case of the 
3.125 percent 1949-52 bond, the bid price on December 15, 1934,
was 101 15/32, implying a profit of 1 15/32. Alternatively, since the
offering announcement guaranteed a minimum allotment, in this case
$10,000, a subscriber could sell the securities to a dealer on a when- 
issued basis. In this second case, the investor would take delivery of 
the bonds and immediately hand them over to the dealer, retaining 
the difference between par and the when-issued price that was previ¬ 
ously agreed on. 15 
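
The allotment arithmetic in the example above can be sketched as follows. The $100,000 request, 10 percent deposit, 18 percent allotment, and $10,000 guaranteed minimum come from the text; the function name, the use of integer percentages, and the cap of the allotment at the amount requested are assumptions made for illustration.

```python
def cash_subscription(request, deposit_pct, allotment_pct, guaranteed_minimum):
    """Return (deposit, allotment, balance_due) for a cash subscriber.

    Percentages are whole numbers, e.g. 18 for 18 percent.
    Assumption: the allotment is the larger of the pro rata share and the
    guaranteed minimum, but never more than the amount requested.
    """
    deposit = request * deposit_pct / 100
    pro_rata = request * allotment_pct / 100
    allotment = min(request, max(pro_rata, guaranteed_minimum))
    balance_due = allotment - deposit
    return deposit, allotment, balance_due


# The example from the text: a $100,000 request with a 10 percent deposit.
deposit, allotment, balance_due = cash_subscription(100_000, 10, 18, 10_000)
print(deposit, allotment, balance_due)

# A smaller hypothetical request falls back on the guaranteed minimum.
small = cash_subscription(20_000, 10, 18, 10_000)
print(small)
```

In the second call the pro rata share is only $3,600, so the guaranteed minimum of $10,000 determines the allotment; this is the quantity a subscriber could safely sell on a when-issued basis.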

Neither of the strategies associated with cash subscription was with¬ 
out risk. Since the market price of the bond on the issue date was 
uncertain at the time of subscription, there is clear risk in actually 
taking delivery and then selling the bonds on the open market. Since 
the allotment was not guaranteed, an investor had no way of know¬ 
ing the quantity that would be delivered and could not safely sell 
more than the guaranteed amount on a when-issued basis. Exchange, 
on the other hand, was less risky since the amount of the new issue 
received was always guaranteed. 

At this point, it is useful to compute the realized values of both the
exchange privilege and the profit from cash subscriptions. The profit
from cash subscription is easily determined by collecting data on the
first quoted price of a new issue and taking the difference from par.
In order to value the exchange privilege, information in the
Treasury’s offering notices, reprinted in the Annual Report of the Secretary of
the Treasury, was used to match each maturing note and bond,
beginning with the 2 percent Treasury note maturing on March 15, 1932,
with the new issues for which it could be exchanged. Then the value
of each new security on its issue date was determined by using the
closing quotation from the New York Times on that date. 16 The realized
value of the exchange privilege is the difference between the first bid
price of the new issue and par. When a security could be exchanged
for more than one new one, the value was assumed to be that of the
most lucrative trade available. Obviously, the realized value was
always nonnegative.

15 Porter (1938, 1939) calls this a "free-ride” and describes in detail how to make a
quick profit in the week preceding the new issue. She suggested subscribing and selling
the guaranteed amount, then only $1,000, on a when-issued basis. According to Bed's
account in the December 11, 1938, New York Times, Porter's article in the December
1938 issue of Scribner’s Magazine set off a rush of subscriptions during that month and
caused the Treasury to reduce the guaranteed amount.

16 In several cases, no quote was found in the newspaper. For bonds, the first
available quotation reported by Childs (1947) was substituted. For notes, the first available
quote was located in the New York Times.

To illustrate the procedure, take an example. The 2.5 percent note 
issued on January 29, 1934, and maturings on March 15, 1935, could 
be exchanged for a 1.625 percent note maturing on March 15, 1940. 
The March 16,1935, New York Times reported the first bid on the new 
issue as IOIVss, 1.16 percent above par. 

Between 1932 and 1940, the average value of the exchange 
privilege realized by a holder of maturing coupon securities was 1.1 
percent with a standard deviation of 0.67 percent. Cash subscribers 
realized an average profit of 0.68 percent with a standard deviation of 
0.51 percent. 

It appears that the mechanism used to issue and refund Treasury 
debt involved giving away substantial amounts of money. But closer 
examination of both the legal and economic environment of the 
1930s leads to an explanation of the Treasury’s behavior. From the 
end of 1929 to the end of 1939 the interest-bearing debt of the U.S. 
government more than doubled, rising from $16 billion to $41 billion. 
Prior to the Depression, major buildups of government debt had 
occurred only during wartime and the severe depressions of the 
nineteenth century. As such, the Treasury had no real mechanism for 
issuing debt. The network of dealers and banks that serve to distrib¬ 
ute newly issued securities today was not yet in place. 

Current law also constrained Treasury actions. The Second Liberty 
Bond Act, which gave authority for the issuance of Treasury debt, 
required that new Treasury bonds and certificates of indebtedness be 
issued at par and new notes issued at not less than par (U.S. Depart¬ 
ment of the Treasury 1938). Given this statute, the only way to 
guarantee that a new issue would be sold (or maturing securities 
presented for exchange) was to set the coupon rate on the new bond 
or note above the current market interest rate on a comparable secu¬ 
rity. 17 

As is mentioned above, participation in either subscription or ex¬ 
change entailed risks, and so some sort of compensation was in order. 
With a potential exchange, there was no way of knowing what the 
value would be until the full transaction was complete. The character¬ 
istics of the new security were announced only a few weeks prior to 
maturity of the existing bond or note. For subscribers, there was the 

17 Perhaps surprisingly, auctions of coupon securities by the U.S. Treasury did not 
until 1970. Treasury bills, on the other hand, have been auctioned since their 
®«ption in 1929. 
therefore are more likely to end up commuting outward or
circumferentially because they cannot shift residential locations to
commute inward to suburban jobs or shift job locations to accommodate
centrally located residences. Thus cities having more two-worker
households or more black and minority workers are likely to have more
wasteful commuting. Further research probing the roles of these and
other factors in causing some workers to choose wasteful commutes is
clearly needed. It is hoped that future urban economists will not have 
to characterize any commuting behavior as wasteful and instead will 
be able to explain it. 
The Case of the Negative Nominal Interest
Rates: New Estimates of the Term Structure
of Interest Rates during the Great Depression


Stephen G. Cecchetti 

Ohio State University and National Bureau of Economic Research


Throughout the 1930s and early 1940s, U.S. Treasury bonds and 
notes appeared to have negative nominal yields as they approached 
maturity. But negative nominal interest rates are impossible in a 
world in which one can always hold cash. The resolution to this 
puzzle is that Treasury securities, in addition to making coupon 
payments, gave the owner the right to buy a new security on a future
date. This paper describes the institutional environment that led to 
the apparent negative nominal interest rates, develops a method for 
valuing the “exchange privilege,” and computes accurate measures 
of the yield to the coupon-bearing component of these composite 
bond/options. These corrected bond and note yields are then used to 
calculate new estimates of the term structure of interest rates from 
1929 to 1949. 


I. Introduction 

On December 31, 1932, the New York Times listed the yield on a 3.5 
percent U.S. Liberty bond as -1.74 percent. This seems impossible.
An investor can always hold cash rather than an interest-bearing secu¬ 
rity, so any bond should have a positive nominal yield. It is well known 

This paper is a revised version of NBER Working Paper no. 2472 (Cecchetti 1987).
Much of the work reported in this paper was completed while I was a visiting scholar at
the Federal Reserve Bank of Kansas City. Thanks are due to Bob Cumby for discussions
and aid beyond the call of duty; Rick Mishkin for numerous conversations; Bob
Barsky, Ernie Bloch, Hobart Carr, Ben Friedman, Ed Kane, Peter Temin, Paul Wachtel,
Eugene White, and the participants in the 1987 NBER Summer Institute for discussions
and comments on an earlier draft; and Ellen Nose, Rag Mital, Tom Dean, and
Chuck Larson for untiring research assistance.
that during the Great Depression the prices of Treasury bills at
auction occasionally exceeded par. But the negative yields were
extremely small, on the order of -0.05 percent. 1 Yields of this small a
magnitude can be explained by the fact that Treasury bills were
exempt from personal property taxes in some states (see Homer 1977,
p. 555) and that Treasury securities were required as collateral for a
bank to hold U.S. government deposits. 2 Negative nominal yields on
the order of -2 percent are an entirely different story. In fact, from
mid-1932 through mid-1942, the vast majority of coupon-bearing
U.S. government securities bore negative nominal yields as they
neared maturity. 3

Since negative nominal yields are impossible in a world in which 
one can always hold cash, these securities must have had other attri¬ 
butes that were being valued. During the 1930s, the standard practice 
of the U.S. Treasury was to issue new bonds with coupon rates that 
implied market prices above par, but to sell them at par. Holders of 
maturing bonds and notes were given preferential treatment in the 
distribution of these new issues. Coupon-bearing Treasury securities
had what was called an “exchange privilege.” At maturity, they could
be exchanged at par for a new issue. Government bonds and notes 
were not just coupon securities; they were options as well. 4 The option 
had value that was included in the quoted price. As a bond ap¬ 
proached maturity, this premium caused the price to rise high 
enough that the computed yield was negative. 

The solution to the first puzzle, that of the negative nominal inter¬ 
est rates, has given way to a second one: Why did the Treasury sell 
new issues at prices below those prevailing in the market? The answer 
to this question can be found by studying the institutional environ¬ 
ment of the 1930s. Legal constraints forced the Treasury to sell new 
securities at par. To ensure that an offering actually sold, the coupon 
rate had to be set above the current market interest rate. Initial pur¬ 
chasers were paid to place the new issue. This was the method of 
underwriting. 

The purpose of this paper is to describe the conditions that led to
the apparent negative nominal interest rates and then use this
information to construct accurate data on the returns to holding U.S.
government securities during the 1930s and 1940s. Proper computation
of the term structure during the 1930s requires careful examination
of the institutions of the bond market and Treasury debt management.
In what follows, a method for valuing the exchange
privilege is described and used to correct the measurement of the
yields of traded securities. These are used to construct term structure
estimates from 1929 to 1949 that are consistent with those currently
in use. These new data replace the sketchy data contained in the
Federal Reserve Board’s Banking and Monetary Statistics of the United
States and for the first time allow one to follow changes in the shape of
the term structure during the Great Depression. The interest rate
data can be added to new data on 3- and 6-month time loans in
Mankiw and Miron (1985) and the new output, production, and
unemployment data in Romer (1986a, 1986b, 1988, 1989).

1 Bids in excess of par were received throughout 1939, 1940, and 1941. The highest
recorded was 100.018 on January 8, 1941. See the Annual Report of the Secretary of the
Treasury (1941, p. 301).

2 I have also heard the claim that banks substituted Treasury bills for currency of
smaller denomination in making interbank transfers, and so the negative yield reflected
convenience. Unfortunately, this could not be substantiated.

3 The plots in both Durand (1942) and the U.S. Treasury Bulletin for 1939 imply
negative nominal yields for maturities below 2 years. Childs (1947, p. 259) also notes
the existence of negative nominal yields in the 1930s but provides no explanation.

4 These options were very similar to the quality or delivery option associated with
Treasury bond futures. See Figlewski (1986, p. 31) for a description.

There are two motivations for constructing this new data set. First, 
empirical research in macroeconomics often relies on the use of 
lengthy time-series data. 5 While Salomon Brothers (1985) publishes
estimates of yields at 3 months, 1, 2, 3, 4, 5, 10, and 20 years to 
maturity beginning in 1950, 6 data on the term structure of interest 
rates prior to 1950 are noticeably missing. Second, the resurgence of 
interest in the economics of the Great Depression 7 makes it all the 
more important to exploit new data sources. 

The remainder of this paper is divided into four sections. Section II 
describes the raw data collected and used in the study. Section III 
provides a detailed account of the Treasury practices that caused 
nominal interest rates to be negative. The rationale for the Treasury's 
behavior is also examined. A method for valuing the exchange provi¬ 
sion is then proposed and used to compute the yield to the coupon- 
bearing component of the composite bond/option. Section IV uses 
these corrected yields to construct estimates of the term structure 
using a technique derived by Nelson and Siegel (1985). The conclud¬ 
ing section (Sec. V) provides a comparison of the new interest rate 
series with those previously available and finds that there are substan¬ 
tial differences. The adjustments for the exchange privilege lead to 
systematically higher estimates of yields at maturities below 5 years. 


5 This is true of the original work on business cycle dating summarized in Moore and
Zarnowitz (1986) and the more recent studies of the effects of money by Friedman and
Schwartz (1982) and investment by Gordon and Veitch (1986).

6 Recently, McCulloch (1987) has estimated coupon-corrected yield curves for
December 1946 to February 1987 that will likely replace the Salomon data in future
research.

7 Papers by Bernanke (1983, 1986), Field (1984), Bernanke and Powell (1986), and
Hamilton (1987) and the essays in Brunner (1981) are examples.



II. Data




Existing data on nominal interest rates prior to World War II are both
limited in scope and imprecise. The Federal Reserve Board's Banking 
and Monetary Statistics of the United States contains several series for 
interest rates during the interwar period, but it is difficult to tell 
exactly how the numbers were constructed and what securities they 
actually refer to. For example, table 122 on page 460 of Banking and 
Monetary Statistics includes monthly series for 3-5-year tax-exempt 
Treasury notes, while table 128 on page 468 reports longer-term 
bond yields under the simple heading “U.S. Government.” The sec¬ 
ond of these refers to the unweighted average of the yield on all 
outstanding bonds with at least 12 years to maturity. Clearly, there is 
motivation for collecting a new and more complete set of interest rate 
data. 

Construction of a new data set on the term structure requires infor¬ 
mation on the prices of outstanding Treasury issues. These raw data 
were collected from the New York Times financial column entitled 
“Bond Sales on the New York Stock Exchange.” Quotes on the prices 
of all U.S. Treasury bonds, notes, and certificates of indebtedness
were collected from the New York Times for the final trading day of 
each month from January 1929 to December 1949. The data set is 
complete in that it contains a yield for every bond, note, and 
certificate for every month during which it was in existence. It is 
composed of all 152 coupon-bearing securities either in existence in 
January 1929 or issued during the 21-year period examined. Of this 
total, 56 are bonds, 54 are notes, and 42 are certificates of indebted¬ 
ness. 

In addition to coupon securities, beginning in mid-1931 data were 
collected on the yield of Treasury bills with 3 months to maturity; 
prices are not reported. 8 As is currently the case, Treasury bills were 
pure discount securities. Other Treasury bills of shorter maturity 
were excluded since the major objective is to study yields at longer 
maturities. 9 

As is nearly always the case in research on financial markets, the 
data refer to dealer price quotes. There is no guarantee that actual 

8 Childs (1947, p. 432) describes early Treasury bill issues. While the first Treasury
bills were issued in 1929, it was not until 1931 that a series can be constructed that is
composed solely of issues with 3 months to maturity. During 1929 and 1930, bills were
issued at irregular intervals and matured in 3, 6, 9, or 12 months. Three-month
Treasury bill rates were found for February, April, and May 1931 as well as every month
beginning with July 1931.

9 In addition, all interest-bearing government debt not issued directly by the U.S.
Treasury, such as securities issued by the Federal Home Loan Bank Board or the
Reconstruction Finance Corporation, is omitted.
transactions occurred at these prices. This problem is minimized by
computing yields based on the mean of the bid/ask spread. But it is 
impossible to know how large an error comes from systematic differ¬ 
ences between dealer quotes and transactions prices. 

It is possible, however, to ensure that trading occurred. The New 
York Times does report volume. For example, on January 30, 1932, 
volume in the 3.375 percent Treasury bonds of 1940-43 amounted to
$130,000. While this is a very small fraction of the nearly $360 million 
of the issue outstanding, it is important that there was some trading. 
To make the data set complete, in several isolated cases it was neces¬ 
sary to use price quotes that did not reflect trading on the New York 
Stock Exchange. These quotes were found in the New York Times 
under the heading “U.S. Bond Quotations—Closing Quotations for 
Issues Not Traded in on [sic] the Stock Exchange Yesterday.”

Since the majority of U.S. Treasury bonds issued during this period 
contained call provisions, there is a problem in computing the yield to 
maturity. Fortunately, except for several very special cases, all bonds 
were called on the first allowable date. As such, all yields were com¬ 
puted to the call date. 10 

The raw data set consists of 9,070 observations over 252 months, or 
just under 36 observations per month, on average. These raw data are 
available from the author on standard diskettes. As one would expect, 
the number of observations is small during the first few years, increas¬ 
ing substantially with the debt issues of the middle 1930s and again 
with the issues during World War II. In 1929, 1930, and 1931, there
is an average of only 14 data points per month. By 1933, the average 
is over 20, rising steadily to 40 in 1939 and to 54 in 1945 and falling to 
38 in 1949. The implication is that the estimated yield curves will be 
less accurate for the earlier period simply because of the paucity of 
data. 

III. Negative Nominal Yields and the Exchange 
Privilege 

Consider the following exercise. Take the data described in Section II 
for a representative month and compute the yield to maturity for all 
the coupon-bearing securities based on the mean of the bid/ask 
spread. The results for February 1935 are plotted in figure 1. 11 In the

10 An alternative is to compute the yield to the call date when the price of the security
exceeds par and the yield to the final maturity date when the price is below par. Use of
this rule would have virtually no effect on the results since bonds nearly always sold at
prices in excess of par.

11 A similar diagram was constructed for every month of the data set. From 1934 to
1941, all the figures had the same general features as fig. 1.
figure, N’s refer to fully tax-exempt securities and P’s refer to
partially tax-exempt securities. As is discussed in detail in Cecchetti
(1987), this distinction is largely irrelevant prior to 1940 but impor¬ 
tant beginning in 1941. 12 The solid line is an estimate of a term 
structure using the technique described in Section IV. (Following the 
standard convention, all interest rates are in bond yield equivalents: 
two times the 6-month rate.) 

Figure 1 has several striking features. First, except for the single N 
representing the 3-month Treasury bill yield of 0.15 percent, the 
yield curve is smoothly upward sloping. If one were to neglect the 
vertical scale, the picture would not seem odd. The problem is that 
the lowest point is a Treasury note with 5 months to expiration and a 
yield of -1.25 percent. If this result were obtained for an isolated 
month, one would be inclined to check the raw price data for errors. 
But negative yields arise consistently from 1932 through 1942. 

Discussions of the period note the existence of negative nominal 
yields. They point out that during the 1930s the standard practice of 
the U.S. Treasury was to issue new bonds above par and give holders 
of maturing bonds, notes, and certificates preferential treatment in 
distributing the new issue. Maturing securities had an “exchange 
privilege” that gave them added value. 

The remainder of this section is divided into two parts. The first 

12 Prior to 1941, the partial tax exemption had the effect of making the interest on
bonds fully taxable to individuals but tax exempt to corporations. 
provides a discussion of the institutional environment that led to the
apparent negative nominal yields and discusses the reason for the 
Treasury to issue securities in the way that it did. This is followed by a 
detailed description of how to correct the data for the existence of the 
exchange privilege. 

A. The Exchange Privilege 

Each year, the Annual Report of the Secretary of the Treasury describes 
the offerings of securities during that year. In the 1930s, new offer¬ 
ings were announced from 1 to 2 weeks prior to the date of issue. The 
announcement stipulated the method of payment. The purchaser was 
either required to pay cash, required to exchange an existing security 
(valued at par), or given a choice of the two. Of the 86 new and 
additional offerings of bonds, notes, and certificates of indebtedness 
from 1932 to 1940, 15 required cash payment, 31 could be obtained 
only by exchange, and the remaining 40 gave the purchaser a 
choice. 13

For reasons that will be discussed below, the Treasury's regular 
practice was to fix the coupon rate on a new issue above the current 
interest rate for a bond of equivalent maturity, causing the initial 
price of the new bond in the securities market to exceed par. Ex¬ 
change allowed the holder of a maturing security to reap the benefit 
of this, giving value to the exchange privilege. 14 Of the 57 coupon- 
bearing securities that matured between 1932 and 1940, 54 could be 
exchanged at maturity for new issues that initially sold in excess of 
par. 

Cash payment was by subscription. Prospective purchasers made 
application for a certain amount of the issue and sent either 5 percent 
or 10 percent (depending on the issue) of the face value as a deposit. 
Subscription was guaranteed up to some level, usually $5,000 or 
$10,000. Individuals’ requests in excess of the minimum were filled as 
a percentage of the total of all applications. For example, subscribers 
to the 3.125 percent 1949-52 bond, whose issue was announced on 
December 3, 1934, were allotted 18 percent of the amount they re¬ 
quested, but not less than $10,000. Between 1932 and 1940, cash 


13 The total of 86 issues exceeds the actual number of new securities by 19 because of
the practice of making additional offerings of already existing securities.

14 Durand (1942) mentions the exchange privilege but implies that its value is derived
from the saving in brokerage fees that comes from rolling over an investment. It is
difficult to see why someone wishing a long-term security would buy a maturing one
simply for the benefit of having it roll over. Thies (1985), in replicating the work of
Durand, also notes the existence of the exchange privilege and correctly points out the
source of its value.
subscribers, on average, were allotted 15.4 percent of their requests,
but not less than $5,263. 

Once the allotment was determined, a cash subscriber could take 
delivery by paying the remaining balance. For example, a request for 
$100,000 might require a $10,000 deposit. If the final allotment were 
18 percent, then on delivery the subscriber would pay the balance of 
$8,000. Because the bonds were issued above par, a cash subscriber 
could make a profit by selling them immediately. In the case of the 
3.125 percent 1949-52 bond, the bid price on December 15, 1934, 
was 101 15/32, implying a profit of 1 15/32. Alternatively, since the
offering announcement guaranteed a minimum allotment, in this case
$10,000, a subscriber could sell the securities to a dealer on a when- 
issued basis. In this second case, the investor would take delivery of 
the bonds and immediately hand them over to the dealer, retaining 
the difference between par and the when-issued price that was
previously agreed on. 15
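
The deposit-and-allotment arithmetic described above can be sketched in a
few lines of Python. This is only an illustration: the function name, the
use of whole-number percentages, and the cap of the allotment at the
amount requested are assumptions of the sketch, not details taken from
the Treasury's offering notices.

```python
def cash_subscription(request, deposit_pct, allotment_pct, guaranteed_min):
    """Return (deposit, allotment, balance_due) in dollars of face value.

    Hypothetical helper: percentages are whole numbers (10 means 10 percent)
    so that the arithmetic stays in exact integers.
    """
    deposit = request * deposit_pct // 100
    pro_rata = request * allotment_pct // 100
    # Requests above the minimum were filled pro rata, but not below the
    # guaranteed amount. Capping at the request itself is an assumption.
    allotment = min(request, max(pro_rata, guaranteed_min))
    balance_due = allotment - deposit
    return deposit, allotment, balance_due


# Example from the text: a $100,000 request, 10 percent deposit,
# 18 percent allotment, $10,000 guaranteed minimum.
print(cash_subscription(100_000, 10, 18, 10_000))  # (10000, 18000, 8000)
```

A small request illustrates the floor: a $20,000 application at the same
rates would be allotted the guaranteed $10,000 rather than 18 percent of
the request.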

Neither of the strategies associated with cash subscription was with¬ 
out risk. Since the market price of the bond on the issue date was 
uncertain at the time of subscription, there is clear risk in actually 
taking delivery and then selling the bonds on the open market. Since 
the allotment was not guaranteed, an investor had no way of know¬ 
ing the quantity that would be delivered and could not safely sell 
more than the guaranteed amount on a when-issued basis. Exchange, 
on the other hand, was less risky since the amount of the new issue 
received was always guaranteed. 

At this point, it is useful to compute the realized values of both the 
exchange privilege and the profit from cash subscriptions. The profit 
from cash subscription is easily determined by collecting data on the 
first quoted price of a new issue and taking the difference from par. 
In order to value the exchange privilege, information in the Trea¬ 
sury’s offering notices, reprinted in the Annual Report of the Secretary of 
the Treasury, was used to match each maturing note and bond, begin¬ 
ning with the 2 percent Treasury note maturing on March 15, 1932, 
with the new issues for which it could be exchanged. Then the value 
of each new security on its issue date was determined by using the 
closing quotation from the New York Times on that date. 16 The realized 

15 Porter (1938, 1939) calls this a “free-ride” and describes in detail how to make a
quick profit in the week preceding the new issue. She suggested subscribing and selling
the guaranteed amount, then only $1,000, on a when-issued basis. According to Bell’s
account in the December 11, 1938, New York Times, Porter's article in the December
1938 issue of Scribner's Magazine set off a rush of subscriptions during that month and
caused the Treasury to reduce the guaranteed amount.

16 In several cases, no quote was found in the newspaper. For bonds, the first
available quotation reported by Childs (1947) was substituted. For notes, the first available
quote was located in the New York Times.
value of the exchange privilege is the difference between the first bid
price of the new issue and par. When a security could be exchanged 
for more than one new one, the value was assumed to be that of the 
most lucrative trade available. Obviously, the realized value was al¬ 
ways nonnegative. 

To illustrate the procedure, take an example. The 2.5 percent note 
issued on January 29, 1934, and maturing on March 15, 1935, could 
be exchanged for a 1.625 percent note maturing on March 15, 1940. 
The March 16, 1935, New York Times reported the first bid on the new 
issue as 101 5/32, 1.16 percent above par.
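
The calculation in this example is simple enough to sketch in Python.
The helper names below are hypothetical, and flooring the value at zero
is one reading of why the realized value is never negative (a holder
could decline the exchange and take par), not a rule stated in the
source.

```python
from fractions import Fraction


def quote_in_32nds(whole, thirty_seconds):
    """Convert a quote such as 101 5/32 into an exact fraction (par = 100)."""
    return Fraction(whole) + Fraction(thirty_seconds, 32)


def realized_exchange_value(first_bids, par=100):
    """Best available premium over par among the new issues on offer.

    Flooring at zero reflects the holder's ability to decline the exchange
    (an interpretation, labeled as such in the accompanying text).
    """
    best = max(bid - par for bid in first_bids)
    return max(best, Fraction(0))


bid = quote_in_32nds(101, 5)
value = realized_exchange_value([bid])
print(float(value))            # 1.15625
print(round(float(value), 2))  # 1.16
```

Because par is 100, the premium in points is also the percentage above
par.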

Between 1932 and 1940, the average value of the exchange 
privilege realized by a holder of maturing coupon securities was 1.1 
percent with a standard deviation of 0.67 percent. Cash subscribers 
realized an average profit of 0.68 percent with a standard deviation of 
0.51 percent. 

It appears that the mechanism used to issue and refund Treasury 
debt involved giving away substantial amounts of money. But closer 
examination of both the legal and economic environment of the 
1930s leads to an explanation of the Treasury's behavior. From the 
end of 1929 to the end of 1939 the interest-bearing debt of the U.S. 
government more than doubled, rising from $16 billion to $41 billion.
Prior to the Depression, major buildups of government debt had 
occurred only during wartime and the severe depressions of the 
nineteenth century. As such, the Treasury had no real mechanism for 
issuing debt. The network of dealers and banks that serve to distrib¬ 
ute newly issued securities today was not yet in place. 

Current law also constrained Treasury actions. The Second Liberty 
Bond Act, which gave authority for the issuance of Treasury debt, 
required that new Treasury bonds and certificates of indebtedness be 
issued at par and new notes issued at not less than par (U.S. Depart¬ 
ment of the Treasury 1938). Given this statute, the only way to 
guarantee that a new issue would be sold (or maturing securities 
presented for exchange) was to set the coupon rate on the new bond 
or note above the current market interest rate on a comparable secu¬ 
rity. 17 

As is mentioned above, participation in either subscription or ex¬ 
change entailed risks, and so some sort of compensation was in order. 
With a potential exchange, there was no way of knowing what the 
value would be until the full transaction was complete. The character¬ 
istics of the new security were announced only a few weeks prior to 
maturity of the existing bond or note. For subscribers, there was the 

17 Perhaps surprisingly, auctions of coupon securities by the U.S. Treasury did not
begin until 1970. Treasury bills, on the other hand, have been auctioned since their
inception in 1929.
uncertainty about the size of the allotment and the movement of
interest rates over the week prior to the physical delivery of the secu¬ 
rities. The compensation for this risk is analogous to the fee paid to 
underwriters of corporate securities who commit themselves to selling 
a fixed quantity of a stock or bond at a given price on a future date, 
thereby assuming the risk inherent in price fluctuations. 

Two pieces of evidence support the view that the exchange
privilege and the profit to cash subscription were underwriting
spreads. First, the magnitude of the differential is appropriate.
Cohan (1961), in his study of the cost of floating private debt in the
1930s, concludes that gross underwriting spreads for offerings of Aaa 
public utility bonds between 1935 and 1940 ranged from 1.65 percent 
to 2.01 percent. The discrepancy between this and the approximately 
1 percent compensation for underwriting Treasury issues is easily 
explained by differences in risk. 18 

Additional evidence comes from looking at the identity of the initial 
purchasers of the Treasury’s new offerings. During the 1930s,
individuals in the Second Federal Reserve District, New York, were
allotted over 50 percent of all new securities (on either subscription or
exchange). It is natural to conclude that the banks and dealers in New 
York City, who dominate this Federal Reserve district, were being 
paid a fee to ensure placement of the bonds. 19 

The impact of the legal constraints is also easy to demonstrate. 
Again, take the example of the 1.625 percent note issued on March 
15, 1935, and maturing 5 years later. As has already been noted, on 
March 16, 1935, the first bid for the new issue was 101 5/32. This
implies a yield to maturity of 1.38 percent. During this period, there 
seems to have been a convention that all coupon rates were quoted in 
even eighths. 20 While the Treasury could have set the coupon rate at 
1.5 percent and still sold the issue—the initial price would have been 
approximately 100 19/32—this may not have been viewed as sufficient
compensation for potential underwriters (brokers or individuals) to 
be willing to accept the risk associated with subscribing to this new 
issue. 
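
The figures in this example can be checked roughly with a short Python
sketch. It assumes exactly ten semiannual periods, ignores day counts
and accrued interest, and quotes yields as bond yield equivalents (two
times the 6-month rate); the function names are hypothetical. Under
these simplifications the yield should come out close to the 1.38
percent cited, and the price of a 1.5 percent coupon should land
somewhat above par, in the neighborhood of the figure given in the
text. Exact agreement should not be expected.

```python
def bond_price(coupon_pct, yield_pct, years, par=100.0):
    """Price per 100 of face value with semiannual coupons.

    yield_pct is a bond-equivalent yield: two times the 6-month rate.
    """
    periods = int(round(2 * years))
    r = yield_pct / 200.0          # 6-month rate as a decimal
    coupon = coupon_pct / 2.0      # 6-month coupon payment
    pv_coupons = sum(coupon / (1 + r) ** k for k in range(1, periods + 1))
    return pv_coupons + par / (1 + r) ** periods


def yield_to_maturity(price, coupon_pct, years, lo=-50.0, hi=50.0):
    """Solve bond_price(...) == price for the yield by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        # Price falls as the yield rises, so a model price that is too
        # high means the trial yield is too low.
        if bond_price(coupon_pct, mid, years) > price:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


first_bid = 101 + 5 / 32
ytm = yield_to_maturity(first_bid, 1.625, 5)
print(f"yield on the 1.625 percent note: {ytm:.2f} percent")

# Hypothetical alternative: the same note with a 1.5 percent coupon,
# priced at the yield just computed.
alt_price = bond_price(1.5, ytm, 5)
print(f"price with a 1.5 percent coupon: {alt_price:.2f}")
```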

Contemporary beliefs, as expressed in the January 2, 1939, issue of


18 The fact that issues were heavily oversubscribed suggests that the payment offered 
by the government exceeded the market-clearing underwriting fee. But since every¬ 
one knew how the subscription procedure worked, there must have been substantial 
gambling involved in determining the subscription amounts. 

19 This is similar to the underwriting mechanism of the 1950s described in Bloch 
(1964). Then banks were allowed to buy new issues by simply crediting the Treasury’s 
tax and loan account at that bank. 

20 The first coupon security that did not have a coupon rate that was a multiple of
one-eighth was a 0.90 percent note issued on December 1, 1944. 
Barron's, support this view. An article entitled “Valuing of ‘Rights’ in
Treasury Notes” states in part that “at the present time, the Treasury 
is faced with the prospect of having to borrow substantial amounts of 
new money for some time to come. In addition, there is a large vol¬ 
ume of short-term Treasury obligations that must be refunded dur¬ 
ing the next few years. Under these circumstances, [Treasury] Secre¬ 
tary [Henry] Morgenthau has apparently concluded that it is wise to 
make new United States Treasury issues unusually attractive to inves¬ 
tors" (p. 1). 

B. Computing the Corrected Yields 

An estimate of the market value of the exchange privilege substan¬ 
tially prior to the maturity of a security is needed to correct the data 
for the value of the exchange privilege. 21 The effect of the exchange 
privilege is to raise the price of a bond above what it otherwise would 
be. An interpretation of this is that securities were trading as if their 
face value exceeded 100 by a “bonus” representing the value of the 
exchange privilege. Once the bonus is estimated, the yield to the 
coupon-bearing component of the composite security can be com¬ 
puted. 

The realized value of the exchange privilege—computed by assum¬ 
ing that an investor holds a bond to maturity, makes the exchange, 
and sells the new security on the day of issue—is of no use. As is clear
from the previous discussion, the realized value is a biased estimate of 
the market’s expectation since it includes an underwriting spread. 
Fortunately, an arbitrage condition can be used to value the exchange 
privilege and correct the yield estimates. 

All coupon-bearing securities in the sample made payments at 6- 
month intervals. This means that all notes, bonds, and certificates 
with less than 6 months to maturity were pure discount securities. 22 
Beginning in June 1931, the government regularly issued 3-month
Treasury bills. Arbitrage implies that the yield on a note with less than 
6 months to maturity and a bill maturing on the same day must be the 
same. This provides a simple way of calculating the market (or im¬ 
plied) value of the exchange privilege. Three months or less prior to 
maturity, each coupon-bearing security can be matched with a Trea¬ 
sury bill maturing on the same day. The implied value of the ex¬ 
change privilege is the difference between the traded price of the 


21 The same article in Barron's quoted above contains subjective estimates of the value
of the exchange privilege that differ by small amounts from those computed here.

22 The fact that interest on coupon-bearing securities accrues linearly introduces a
small error that is imperceptible at low interest rates.
security and the price implied by the Treasury bill rate, appropriately
discounted. 23

To see how the computation is done, define P as the price quoted in 
the newspaper for a bond nearing maturity. An individual purchas¬ 
ing the bond must pay this price plus accrued interest. Interest on 
government securities accrues linearly between coupon payments. 
Assume that the bond pays a coupon of C dollars per year, or C/2 dollars every 6
months, and has m years to maturity. Since m is less than one-half year
(the bond has less than 6 months to maturity), the last coupon payment
was (1/2 - m) years ago, and the accrued interest is C(1/2 - m).
The price with accrued interest is just P' = P + C(1/2 - m). Arbitrage
requires that the yield to holding this security equal the yield to
holding a Treasury bill maturing in m years; call this r. The implied
value of the exchange privilege (Pr') is calculated from the arbitrage
relationship 24

$$P' = \frac{100 + \tfrac{1}{2}C + Pr'}{(1 + r)^{m}}. \qquad (1)$$


The computation is very simple. Take the example of the 2.5 per¬ 
cent note maturing on March 15, 1935. On December 30, 1934, with 
2½ months to maturity, the closing quotation for the mean of the bid/
ask spread was 101.19, so the actual price with accrued interest was 
101.19 + 2.5(3.5/12) = 101.92. If calculated naively, this implies a 
nominal yield to maturity of -3.28 percent at an annual rate. On the 
same date, the Treasury bills maturing on both March 7 and March 
21, 1935, yielded a 0.20 percent bid, but no ask is reported. Assuming
a bid/ask spread of 2/32 indicates a mean bid/ask spread yield of 0.05
percent. The implied value of the exchange privilege is calculated as 
the face value that is consistent with a price of 101.92 and a yield of 
0.05 percent: 


$$101.92 = \frac{100 + (2.5/2) + Pr'}{(1 + .0005)^{2.5/12}}. \qquad (2)$$


For this case, the value of Pr' is 0.68. The bond is trading as if its face 
value were 100.68. As noted above, the bond could have been traded 
in for a new security selling for 101.16. So while the realized value was 
1.16, the implied or market expected value was only 0.68. 
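The worked example can be reproduced in a few lines. This is a minimal sketch: the function and argument names are hypothetical, accrued interest is taken as C(1/2 - m), which is what the 2.5(3.5/12) term in the example implies, and the bill yield is compounded annually as in equation (2).

```python
def implied_exchange_premium(quoted_price, coupon, months_to_maturity, bill_yield_pct):
    """Implied value of the exchange privilege for a security in its last six months."""
    m = months_to_maturity / 12.0            # years to maturity, assumed below 1/2
    accrued = coupon * (0.5 - m)             # linear accrual since the last coupon
    dirty_price = quoted_price + accrued     # P' in the text
    r = bill_yield_pct / 100.0               # matched Treasury bill yield
    value_at_maturity = dirty_price * (1.0 + r) ** m
    # Whatever the price implies beyond face value plus the final coupon
    # is attributed to the exchange privilege.
    return value_at_maturity - (100.0 + coupon / 2.0)


premium = implied_exchange_premium(101.19, 2.5, 2.5, 0.05)
print(round(premium, 2))  # 0.68
```

The inputs are the figures quoted above for the 2.5 percent note on December 30, 1934.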

This procedure was employed for all coupon securities maturing 


23 This ignores the tax distortions in the Treasury bill data mentioned in the
Introduction, which are clearly small relative to the problem caused by the exchange
privilege.

24 Tax considerations do not affect this calculation since both interest and capital
gains on government securities were tax exempt prior to 1941. See sec. 4.1 of Cecchetti
(1987) for a partial discussion.
between March 1932 and December 1944. 25 All the estimates are
based on the mean of the bid and ask price and maturity dates that 
match within 3 days. When an ask price was not available, one was 
computed from the bid assuming a bid/ask spread of 2/32. 26 (It is worth
noting that marking the coupon security to the Treasury bill rate 
makes the information in the note or bond yield redundant. As such, 
the yield curves estimated in Sec. IV do not utilize the coupon security 
yields at shorter maturities.) 

A simple univariate regression can be used to summarize the rela¬ 
tionship between the realized and the market expected value of the 
exchange premium. If one assumes that the realized premium (Pr)
equals the expected premium (Pr') plus an orthogonal error, the appropriate
regression is (standard errors are in parentheses)

$$Pr(i) = \underset{(.161)}{.461} + \underset{(.216)}{1.032}\, Pr'(i), \qquad (3)$$

mean of Pr = 1.036, number of observations = 65, R^2 = .25.

As anticipated, the market-implied premium is correlated with the 
realized value but systematically underestimates it. 27 The evidence 
supports the hypothesis that the value of the exchange privilege was 
related to its function as an underwriting fee. 
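Under the maintained hypothesis that the realized premium equals the expected premium plus an orthogonal error, the intercept should be zero and the slope one. In equation (3) the slope is well within one standard error of one, while the intercept is nearly three standard errors above zero. The sketch below shows how such a regression and its standard errors can be computed. The function name and the five data pairs are hypothetical and are not the 65 observations used in the paper.

```python
import math


def ols_univariate(x, y):
    """Ordinary least squares of y on a constant and x, with standard errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    s2 = ssr / (n - 2)                       # residual variance
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx)),
        "se_slope": math.sqrt(s2 / sxx),
        "r2": 1.0 - ssr / sst,
    }


# Made-up implied and realized premiums, for illustration only.
implied = [0.2, 0.4, 0.5, 0.7, 0.9]
realized = [0.7, 0.8, 1.1, 1.1, 1.5]
fit = ols_univariate(implied, realized)
print({name: round(value, 3) for name, value in fit.items()})
```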

Once the implied market value of the exchange privilege is deter¬ 
mined for every relevant coupon security, the yields can be recom¬ 
puted. For each security, Pr'(i) is assumed to be an increment to the 
face value. The yield is recomputed for the entire lifetime of the note 
or bond assuming that the face value is [100 + Pr'(i)], not the usual
100. For example, in the case of the 2.5 percent note described above,
the yield for every month from January 1934 to February 1935 was
recomputed assuming that the face value was 100.68. 28
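The effect of treating the premium as an increment to face value can be illustrated as follows. This is a sketch under simplifying assumptions: the quote falls on a coupon date with two semiannual periods remaining, so there is no accrued interest, and the price of 102.80 is hypothetical. Only the 2.5 percent coupon and the 0.68 premium come from the example in the text.

```python
def price_from_yield(coupon, yield_pct, periods, redemption):
    """Present value of semiannual coupons plus the redemption payment."""
    y = yield_pct / 100.0 / 2.0
    c = coupon / 2.0
    pv_coupons = sum(c / (1.0 + y) ** k for k in range(1, periods + 1))
    return pv_coupons + redemption / (1.0 + y) ** periods


def solve_yield(price, coupon, periods, redemption=100.0):
    """Annual yield (percent) that equates price_from_yield to the observed price."""
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if price_from_yield(coupon, mid, periods, redemption) > price:
            lo = mid                 # trial yield too low
        else:
            hi = mid
    return (lo + hi) / 2.0


quoted = 102.80                                              # hypothetical price
naive = solve_yield(quoted, 2.5, 2)                          # face value of 100
corrected = solve_yield(quoted, 2.5, 2, redemption=100.68)   # face value of 100 + Pr'
print(round(naive, 2), round(corrected, 2))
```

With these inputs the standard computation gives a negative yield, while the computation with the enlarged face value gives a positive one.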

25 While the practice of allowing payment by exchange continued beyond 1944, the
terms were no longer as favorable. Allotment was not guaranteed, and so the value of
the “privilege” disappeared.

26 The results are not sensitive to the use of either the bid or the ask in place of the
midpoint of the spread: the estimated values of the exchange privilege change by less
than .0001.

27 The comparison assumes that an individual cashes in the new security on the day it
is issued. Prior to December 1940, the capital gain from the sale of a note or certificate
of deposit was nontaxable. If, however, the premium were taxable as a short-term
capital gain, this would provide another explanation for the difference between the
realized and implied values in eq. (3). For reasons that are described in sec. 4 of
Cecchetti (1987), it is only beginning in 1941 that the tax effects could have been
important. Examination of the data shows that the relationship between the realized
and implied premium has no systematic difference over the two periods.

28 The fact that both the realized and the market expected values of the exchange
privilege were always strictly greater than zero suggests that the option was always in
the money. If this was the case, an options pricing model is not needed to compute the
market value of the exchange privilege going back in time. The market implied
value must decay at the rate of interest, implying that the method used is correct.
Adopting this procedure entails making a very strong but unavoidable
assumption. For the entire lifetime of a bond, market participants
are assumed to anticipate perfectly what the value of the exchange
privilege will be when the security reaches 3 months to
maturity. Since the prices of all government securities, except for 
Treasury bills, were subject to the distortions of the exchange 
privilege, there is no other way of determining the implicit value of 
the coupon-bearing component of a bond or note at any time other 
than when its maturity is less than 3 months. Since no other data are 
available, there is no other way to proceed. 29 

Figure 2 plots the estimated yields corrected for the value of the 
exchange privilege for February 1935. Again, P and N denote differ¬ 
ences in tax treatment, and the solid line is a term structure using the 
techniques described in Section IV. For longer maturities, in excess of 
7 years or so, the data are nearly identical to the uncorrected yields 
plotted in figure 1. But for the shorter maturities, below 5 years, the 
yields are now strictly positive and smoothly upward sloping. Further¬ 
more, the 3-month Treasury bill yield that is so much higher than the 
remainder of the yield curve in figure 1 no longer stands out. 

There is obviously more noise in the corrected data than in the raw 
data in figure 1. Any plot of the yield to maturity against the time to 

29 It is possible to examine the fluctuation in the market value of the exchange
privilege associated with a given security by recomputing the value of Pr' for the
observations when the bond or note had less than 3 months to maturity. The results of
this exercise show only small movements. 
maturity for coupon securities will produce a smooth pattern, even in
theory, only if all the coupon rates are the same. This explains why,
even at longer maturities, the figure reveals small vertical displacements.
Matters are obviously worse at the shorter maturities. While
this may be due to inaccuracies in the quoted prices, 30 some of the
errors are too large to be accounted for by anything but mismeasurement
of the value of the exchange privilege. For example, on December
31, 1938, the corrected yield to maturity for the 2.125 percent
note maturing on June 15, 1939, is estimated to be 1.025 percent (the
uncorrected yield is -2.13 percent). At the same time, the Treasury
bill maturing on March 29, 1939, yielded 0.05 percent. A data point
like this one is clearly visible on a scatter plot. The error is much too
large to be the result of an error in a price quote that is incorrect by
several thirty-seconds. Mismeasurement of the value of the exchange
privilege is the clear source. It is important to keep in mind that the
shorter the time to maturity, the larger these errors become. As such,
in the following analysis all coupon-bearing securities with maturity of
less than 6 months are omitted.
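Why a given price error matters more at short maturities can be seen in a small sketch. It assumes a pure discount security paying 100 at maturity, annual compounding, and a quote that is off by 1/32. The function names and the price of 100 are illustrative.

```python
def discount_yield(price, years, payoff=100.0):
    """Annualized yield (percent) on a security paying `payoff` in `years` years."""
    return ((payoff / price) ** (1.0 / years) - 1.0) * 100.0


def yield_error(price, years, price_error=1.0 / 32.0):
    """Change in the computed yield when the quoted price is too high by price_error."""
    return discount_yield(price, years) - discount_yield(price + price_error, years)


# Yield error, in percentage points, from a 1/32 price error at several maturities.
maturities = [1.0 / 12.0, 0.25, 0.5, 1.0, 5.0]
errors = [yield_error(100.0, m) for m in maturities]
for m, e in zip(maturities, errors):
    print(round(m, 3), round(e, 4))
```

The yield error is roughly the price error divided by the time to maturity, so the same 1/32 is about twenty times more damaging at 3 months than at 5 years.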

Errors and all, the corrections for the exchange privilege are extremely
important. They eliminate the apparent negative nominal interest rates
on coupon-bearing securities and allow computation of yield curves
for the 1930s.

IV. Estimating the Term Structure

To estimate the term structure, it is necessary to fit a curve through
the scatter of points similar to figure 2 for each month of the sample.
There is a large literature on estimating the term structure of interest
rates. 31 What is needed here is a technique that provides a sufficiently
broad set of alternative shapes but is parsimonious in its parameterization.
Considering that the early months have fewer than 15 data
points apiece, it is important to use a method that requires estimation
of the fewest parameters possible.

Nelson and Siegel (1985) derive a four-parameter model that allows
for humped, monotonic, and S-shaped yield curves. Their specification,
derived as the solution to a differential equation relating the


30 Price quotes are for the end of the day but reflect information and revision at
different times of the day. Because of the nonlinearity of the yield computation, price
quotation errors cause larger errors in yields at shorter maturities.

31 Durand (1942, 1958) and Durand and Winn (1947) pioneered the field by drawing
freehand curves through scatter diagrams. McCulloch (1971) and Shea (1985) provide
examinations of analytical curve-fitting techniques that use cubic splines and exponentials.
Brown and Dybvig (1986) estimate yield curves using the model derived by Cox,
Ingersoll, and Ross (1985).
forward rate to the time to maturity, is

$$R(m) = a + b\,\frac{1 - e^{-m/t}}{m/t} + c\,e^{-m/t}, \qquad (4)$$

where R(m) is the yield to maturity m, and a, b, c, and t are parameters.
While Nelson and Siegel apply (4) to data on pure discount
securities, here it is used as an approximation for all securities regard¬ 
less of whether they are coupon-bearing or not. 
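A minimal sketch of the specification follows, assuming that (4) takes the usual Nelson and Siegel yield-curve form with a level term a, a term b(1 - e^{-m/t})/(m/t), and a term c e^{-m/t}. The parameter values are hypothetical and serve only to show that the same function produces both monotonic and humped curves.

```python
import math


def nelson_siegel(m, a, b, c, t):
    """Yield at maturity m under the assumed form of equation (4)."""
    x = m / t
    if x == 0.0:
        return a + b + c                     # limit as maturity goes to zero
    slope_term = -math.expm1(-x) / x         # (1 - e^{-x}) / x, computed stably
    return a + b * slope_term + c * math.exp(-x)


# Hypothetical parameter sets; maturities in months, t = 50.
rising = [nelson_siegel(m, 3.0, -2.5, 0.0, 50.0) for m in (3, 12, 60, 120, 240)]
humped = [nelson_siegel(m, 2.0, 2.0, -3.0, 50.0) for m in (3, 12, 60, 120, 240)]
print([round(v, 2) for v in rising])
print([round(v, 2) for v in humped])
```

In this form the curve starts at a + b + c and approaches a at long maturities, which is why only four parameters are needed to span the shapes mentioned above.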

It would be preferable to use a technique that accounts for the fact 
that securities with different coupon rates and the same maturity date 
are expected to have different yields. But methods such as those in 
McCulloch (1971) or Brown and Dybvig (1986) require large amounts 
of data. The errors that are introduced by ignoring the differences in 
coupon rates are clearly small relative to the corrections made for the 
exchange privilege. 32 

Equation (4) was modified to take into account the differences in 
the tax treatment of interest payments on various securities. 33 From 
1929 to 1940, interest on Treasury bills, notes, and certificates of 
indebtedness was completely exempt from both personal and corpo¬ 
rate taxes. Interest on Treasury bonds, however, was partially tax 
exempt. But because of the nature of the tax exemption and the 
specifics of the corporate tax code, the interest derived from Treasury 
bonds was essentially tax exempt to corporations. 

Beginning in 1941, all new issues bore interest that was fully taxable 
regardless of the owner. Furthermore, changes in the corporate tax 
code made interest from partially tax-exempt bonds taxable, but at a 
rate that was lower than that on new issues. 

The implication of this is that data exist for estimation of a nontax- 
able term structure from 1929 to 1940 and a taxable term structure 
from 1941 to 1949. Specifically, for the 1941-49 period, a multiplicative
constant can be added to the Nelson and Siegel model that allows
yields on securities with different tax status to differ systematically. 
This gives the specification 

$$R(m) = (1 + a_p D_p + a_t D_t)\, f(m, \theta), \qquad (5)$$

where D_p = 1 if the security is partially tax exempt and 0 otherwise, D_t
= 1 if the security is fully taxable and 0 otherwise, f(m, θ) is the
Nelson and Siegel function in (4), and the a’s are parameters that


32 A comparison of figs. 1 and 2 shows that coupon rate differences cause errors on
the order of 0.1 percentage points, while adjustment for the exchange privilege increases
measured yields by amounts in excess of a full percentage point.

33 Section 4 of Cecchetti (1987) provides a detailed account of these tax considerations.
measure the difference between either partially tax-exempt or fully
taxable securities and nontaxable ones.
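The tax adjustment can be written out directly. The sketch reads (5) as a proportional shift applied to the Nelson and Siegel function, with one shift parameter for partially tax-exempt securities and one for fully taxable securities. The names a_p and a_t, the assumed form of (4), and all numerical values are illustrative assumptions.

```python
import math


def nelson_siegel(m, a, b, c, t):
    """Assumed form of equation (4)."""
    x = m / t
    return a + b * (-math.expm1(-x) / x) + c * math.exp(-x)


def taxed_yield(m, theta, a_p, a_t, partially_exempt=False, fully_taxable=False):
    """Equation (5): scale the nontaxable curve by a tax-status factor."""
    d_p = 1.0 if partially_exempt else 0.0   # dummy for partially tax-exempt issues
    d_t = 1.0 if fully_taxable else 0.0      # dummy for fully taxable issues
    return (1.0 + a_p * d_p + a_t * d_t) * nelson_siegel(m, *theta)


theta = (3.0, -2.5, -0.4, 50.0)              # hypothetical (a, b, c, t)
base = taxed_yield(120.0, theta, 0.05, 0.15)
partial = taxed_yield(120.0, theta, 0.05, 0.15, partially_exempt=True)
taxable = taxed_yield(120.0, theta, 0.05, 0.15, fully_taxable=True)
print(round(base, 3), round(partial, 3), round(taxable, 3))
```

Because the shift is multiplicative, the ratio of a taxable yield to the nontaxable yield is the same at every maturity.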

For each month from January 1929 to December 1949, estimation 
of equation (5) proceeded as follows. First, because of the inaccuracies 
in the procedure for valuing the exchange privilege, all coupon securities
with less than 6 months to maturity were omitted. 35 From 1934
to 1945, the Treasury bill rate was usually below 0.25 percent. As 
such, the yield curves came very close to zero at short maturities. If 
left unconstrained, estimates of the yields at 3 months to maturity 
were occasionally negative. The solution is to force the estimates to go 
exactly through the Treasury bill rate at a maturity of 3 months. The 
constraint is imposed by restricting the value of the constant term a in 
equation (4). 

Finally, as suggested by Nelson and Siegel, estimation was conditional
on the parameter t. Plots of the data show that the yield curve
becomes flat at longer maturities, suggesting that t should not be in a
range above 200, and so a search was done over a grid from 10 to 250
in increments of 10. The final estimate minimized the sum of squared
residuals over this range. 36
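The estimation steps described above can be sketched as follows. The sketch makes several assumptions that are not stated in the text: (4) has the usual Nelson and Siegel form, maturities are measured in months, and the constraint is imposed by substituting a = r_bill - b g1(3) - c g2(3), which leaves a linear regression in b and c for each value of t. The data are generated from known parameters and are not the paper's data.

```python
import math


def loadings(m, t):
    """The two maturity-dependent terms of the assumed form of (4)."""
    x = m / t
    return -math.expm1(-x) / x, math.exp(-x)


def curve(m, a, b, c, t):
    g1, g2 = loadings(m, t)
    return a + b * g1 + c * g2


def fit_given_t(maturities, yields, t, bill_rate, bill_maturity=3.0):
    """Least squares for b and c with t fixed and the curve pinned to the bill rate."""
    h1, h2 = loadings(bill_maturity, t)
    u, v, z = [], [], []
    for m, y in zip(maturities, yields):
        g1, g2 = loadings(m, t)
        u.append(g1 - h1)                    # regressor multiplying b
        v.append(g2 - h2)                    # regressor multiplying c
        z.append(y - bill_rate)              # yield measured from the bill rate
    suu = sum(p * p for p in u)
    svv = sum(p * p for p in v)
    suv = sum(p * q for p, q in zip(u, v))
    suz = sum(p * q for p, q in zip(u, z))
    svz = sum(p * q for p, q in zip(v, z))
    det = suu * svv - suv * suv              # determinant of the normal equations
    b = (suz * svv - svz * suv) / det
    c = (svz * suu - suz * suv) / det
    a = bill_rate - b * h1 - c * h2          # constant implied by the constraint
    ssr = sum((zi - b * ui - c * vi) ** 2 for ui, vi, zi in zip(u, v, z))
    return ssr, a, b, c


def grid_search(maturities, yields, bill_rate, grid=range(10, 251, 10)):
    """Pick the t on the grid that minimizes the sum of squared residuals."""
    best = None
    for t in grid:
        ssr, a, b, c = fit_given_t(maturities, yields, float(t), bill_rate)
        if best is None or ssr < best[0]:
            best = (ssr, a, b, c, float(t))
    return best


# Hypothetical noise-free sample generated with a = 3.0, b = -2.5, c = -0.4, t = 50.
months = [6, 12, 24, 36, 60, 84, 120, 180, 240]
sample_yields = [curve(m, 3.0, -2.5, -0.4, 50.0) for m in months]
bill = curve(3.0, 3.0, -2.5, -0.4, 50.0)
ssr, a_hat, b_hat, c_hat, t_hat = grid_search(months, sample_yields, bill)
print(t_hat, round(a_hat, 3), round(b_hat, 3), round(c_hat, 3))
```

Conditioning on t keeps each step linear, which matters when a month has fewer than 15 observations, and the constraint guarantees that the fitted curve cannot dip below the bill rate at 3 months.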

The results are yield curve estimates for each month. For February
1935, the fitted values are plotted as the solid line in figure 2. As can 
be seen from the figure, the line fits fairly well for maturities con¬ 
tained in the data set. In fact, the fitted values account for 90 percent 
of the variation in the data in over 200 of the 252 months. Extrapola¬ 
tion to maturities longer than those existing in the data can be mis¬ 
leading, however. The fitted values turn down at longer maturities, 
while the scatter plot shows no signs of a downward slope. This im¬ 
plies that the estimates are likely to be unreliable at maturities past 20 
years and have only limited accuracy past 15 years. 


34 An important problem in more recent yield curve estimation does not arise here.
McCulloch (1975) discusses how, for fully taxable securities selling below par, the 
differential tax treatment of the principal appreciation and the coupon payment can 
produce misleading results. But for the period under study, bonds sold almost exclu¬ 
sively above par. 

35 Because of both their call provisions and their tax status, the 3.5 percent Treasury
notes of March, September, and December of 1930-32 were also omitted. These notes
were fully taxable and tended to fall on the yield curve when the maturity date was
assumed to be the final redemption date. Since they were actually called during 1930
and 1931, it was unclear how to differentiate the value of the call provision from the
value of the coupon payments. In addition, after December 1930, all Liberty bonds were
omitted. These were issues used to finance World War I that contained provisions that
allowed them to be called beginning in 1932. They are the only bonds in the sample
that were not called on the first allowable date.

36 It was not possible to estimate t by simple nonlinear least squares. For a number of
months, the estimate of t grew too large. As t grows, the exponent in e^{-m/t} goes to zero and c in (4)
cannot be estimated.
Fig. 3.—Comparison of FRB 3-5-year and new 4-year estimates, 1933-40


The Appendix reports estimates of nominal interest rates at 
maturities from 3 months to 20 years, monthly from January 1929 to 
December 1949. The data are for the last trading day of each month. 
The full data set is available from the author on standard diskettes. 


V. Concluding Remarks 

The mystery of negative nominal interest rates has been solved. The 
legal and economic environment of the 1930s restricted the method 
in which the Treasury issued and refunded coupon-bearing securi¬ 
ties. The Treasury was required by law to issue new bonds at par, and 
to ensure that an offering sold, coupon rates were set so that initial 
market prices exceeded par. In this way individuals and brokers were 
paid an underwriting fee to place the new securities. Since holders of 
maturing securities were given preference in the distribution of a new 
issue, the quoted prices reflected the value of an exchange privilege: 
the option to hold the bond or note to maturity and roll it over into a 
new security. The increase in the price was large enough that the 
yield, computed in the standard way, appeared negative. Adjustment
for this distortion in the price allows recomputation of the yield to the
coupon-bearing component of the composite bond/option. 

Taking account of the value of the exchange privilege is obviously
important. Any comparison of nominal interest rates with and without
the adjustment shows systematic differences. Figure 3, for example,
plots the Federal Reserve Board’s (FRB) series entitled 3-5-year
tax-exempt Treasury notes, from Banking and Monetary Statistics,
table 122, against the new estimates for tax-exempt yields on U.S.
government securities with 4 years to maturity. Figure 4 compares the
new 10 years to maturity estimates with the FRB series for U.S. government
bonds.

Fig. 4.—Comparison of FRB government bond series and new 10-year estimates,
1929-40.

Both plots show striking differences. As one would expect, the FRB 
medium-term series is systematically too low since it fails to account 
for the value of the exchange privilege. The new 4-year estimates are 
on average 0.27 percentage points, or 30 percent higher. This repre¬ 
sents not only a revision in the level of the nominal interest rate for 
this period but an increase in the estimate of the real interest rate as 
well. 

Differences are also apparent in comparing the old and new series 
for long-term yields. This time the FRB series is higher than the new 
10-year series from 1935 through 1940. Over the entire period, the 
average level of the FRB series is 0.16 percentage points higher than 
the 10-year series. In addition, the old series is too stable, with a 
standard deviation 0.22 percentage points below the new 10-year esti¬ 
mates. Examination of the new series at longer maturities shows that 
the FRB bond data are close to the new estimates at 15 years to 
maturity. 

There is no question that these new data are useful. Much of the 
argument over the causes of the length and depth of the Great De¬ 
pression turns on attempts to interpret movements in interest rates. 
The new data will allow detailed study of a type that could not have 
previously been undertaken. In particular, they can be used to exam¬ 
ine movements in the slope of the term structure and shifts in the 
spread between corporate and U.S. government bond yields in the
crucial period from 1929 to 1933. Future research will use these new 
data in an attempt to differentiate among the various theories for the
causes of the most severe economic downturn of the twentieth cen¬ 
tury. 37 


Appendix 

Table A1 contains the constant maturity nominal term structure estimated
using the Nelson and Siegel (1985) specification described in Section IV.
From 1929 to 1940, the estimates are for nominal yields on wholly tax-exempt
securities. From 1941 to 1949, the estimates are for nominal yields on fully
taxable securities. All data refer to the last trading day of the month and are
available from the author on machine-readable diskettes.


37 Cecchetti (1988) reports the first attempt at studying the implications of these new
Welfare Effects of British Free Trade:
Debate and Evidence from the 1840s 


Douglas A. Irwin 

Board of Governors of the Federal Reserve System 


The classical economists engaged in a vigorous debate over whether 
Britain’s tariff reductions in the 1840s should be made contingent on 
tariff liberalization abroad. Some, notably Robert Torrens, believed 
that a unilateral tariff reduction would so deteriorate British terms 
of trade as to outweigh efficiency gains and make the country worse 
off. In this paper, Britain's foreign trade elasticities are estimated for 
this period in a simultaneous equation model. They are used in a 
simple general equilibrium model that explicitly takes the terms of 
trade into account to assess the welfare impact of tariff reductions. 
The results indicate that Britain would have been made worse off 
from a unilateral tariff reduction. However, foreign tariff reduc¬ 
tions mitigated the terms of trade deterioration and could easily 
have made Britain better off. 


I. Introduction 

During the quarter century after Peel’s tariff reforms in 1842, and 
gathering momentum with the repeal of the Corn Laws in 1846, 
Britain shifted its commercial policy from protection to free trade. 
Although many factors contributed to this dramatic change in policy, 
the classical economists had for decades been stressing the benefits of 
tariff reduction and freer international trade. In particular, they uni¬ 
versally condemned the protection accorded British agriculture by 
the Corn Laws.

I thank Jagdish Bhagwati, Michael Edelstein, Douglas Holtz-Eakin, Steven Husted,
Donald McCloskey, Michael Mussa, Jeffrey Williamson, and numerous seminar participants
for their helpful comments. This research was supported by a Sloan Foundation
grant to the Department of Economics at Columbia University, where this paper was 
completed. The usual disclaimer applies. 
Essentially two issues were debated in the controversy over the
Corn Laws and other tariffs. The first concerned the impact of these
measures on the domestic distribution of income. In focusing on the
effects of protection on the welfare of particular classes or sectors, the
classical economists asserted that the Corn Laws harmed labor and
industry, a view rejected by defenders of protection. Williamson
(1986) reviews this debate and uses a detailed general equilibrium
model to analyze the income distributional consequences of the Corn
Law repeal. He finds that British labor did suffer from the Corn Laws
but that the impact on manufacturing is less clear and depends on
whether Britain was a large country on world markets. 

The second issue concerned Britain's trade policy in general and 
addressed the effect of tariff reductions on British welfare overall. At 
that time, international trade theory was sophisticated enough to recognize
that trade restrictions, and not free trade, would in principle
increase national income for a country that could influence its terms 
of trade. Thus a unilateral tariff reduction by such a country would 
have two opposing effects, a gain from more efficient domestic re¬ 
source allocation and a loss from trading along inferior terms of 
trade. While the classical economists were united about the signifi¬ 
cance of improved resource allocation, they were divided about the 
importance of the terms of trade effect. Consequently, economists 
such as Robert Torrens and John Stuart Mill expressed caution about, 
or even outright opposition to, a purely unilateral reduction of the 
Corn Laws and other tariffs. Others, such as Nassau Senior and John 
Ramsay McCulloch, denied that tariff liberalization needed to be re¬ 
ciprocated and either ignored terms of trade considerations or 
thought they would be minor compared with the benefits from im¬ 
proved resource allocation. 

This debate has been largely forgotten, and subsequent generations 
of economists have believed with little qualification that free trade 
brought advantage to Britain. In fact, it was not until McCloskey 
(1980) raised the point that Britain may have “returned some of the 
booty" from trade to other countries by maintaining a suboptimal 
tariff that economic historians have even considered otherwise. After 
making a best-guess calculation of Britain’s optimal tariff in the mid¬ 
nineteenth century, McCloskey concluded that tariff reforms includ¬ 
ing the repeal of the Corn Laws reduced Britain’s tariff below the 
optimal level, thereby creating a "magnanimous Albion” for the rest 
of the world. As Harley and McCloskey (1981, p. 61) argued, “if 
anything, the move towards free trade in the 1840s and 1850s hurt 
Britain.” 

Despite Williamson’s thorough analysis of the economic consequences
of Corn Law repeal for various sectors, no systematic attempt
has been made to quantify the overall static welfare effects of Britain’s
move to free trade during this period. This paper attempts to evaluate
the impact of British tariff reforms that began in the early 1840s.
It will attempt to find out who was right—Torrens, that adverse terms 
of trade effects from tariff reduction would make Britain worse off, 
or Senior, that improved domestic resource allocation would make
Britain better off—by explicitly estimating those two effects. After a 
review of the controversy among the classical economists over the 
issue, data from 1820 to 1846 will be used in Section III to estimate 
elasticities of British foreign trade using a simultaneous equation 
model. These are needed in Section IV, which utilizes a general equi¬ 
librium model of the welfare effects of tariffs based on Basevi (1968). 
This model provides a method for estimating welfare changes from 
tariff reductions, taking into account their effect on the terms of 
trade. A brief conclusion summarizes the findings. 

II. The Tariff and Terms of Trade Debate 

In the early nineteenth century, most British economists advocated 
free trade by stressing the general benefits derived from an interna¬ 
tional division of labor. When trade theory began to address the de¬ 
termination of the terms of trade and the division of gains from tariff 
reduction, the notion that a large country could use a tariff to im¬ 
prove its terms of trade weakened this unambiguous policy implica¬ 
tion. For those economists who thought terms of trade considerations 
could raise serious questions about a unilateral British tariff reduc¬ 
tion, yet at the same time were committed to free trade, the question 
of reciprocity had to be confronted: Should domestic tariff reductions 
be made contingent on similar actions abroad, even at the risk of 
postponing or even forgoing the opportunity for domestic reforms? 
Given this dilemma, the classical economists, good free traders all, 
could differ significantly with respect to the commercial policy tactics 
they would advocate. 

The most strident and controversial critic of unilateral free trade 
was, ironically enough, the economist who shares with David Ricardo 
the credit for developing the theory of comparative advantage, 
Robert Torrens. Torrens was the leading exponent of the view that 
unilateral tariff reductions would be detrimental to British welfare. 
His analysis hinged on two Ricardian concepts. First, international 
demand, and not costs of production alone, plays a role in determining


1 For an excellent overview of Torrens’s work, see Robbins (1958). Torrens originally
conceived these views in 1832 (see Torrens 1833) but developed them considerably in
the early 1840s during the tariff debates, which will be the focus here.
the terms of trade. Second, commercial policies affect the international
distribution of precious metals through the specie-flow mechanism.
From these two precepts, Torrens asserted (1844, p. 28) that
proposals in 1841 to cut tariffs “would have been the greatest calamity 
which could have befallen the country, and might possibly have led to 
national bankruptcy and revolution.” This conclusion was derived 
from the proposition that 

when any particular country imposes import duties upon the 
productions of other countries, while those other countries 
continue to receive her products duty free, then such partic¬ 
ular country draws to herself a larger proportion of the pre¬ 
cious metals, maintains a higher range of general prices than 
her neighbours, and obtains, in exchange for the produce of a 
given quantity of her labour, the produce of a greater quantity of 
foreign labour. [Emphasis added] 

These precious metals, he wrote, could be recovered by tariff retalia¬ 
tion to restore the previous exchange ratio. 

At this juncture, Torrens introduced a numerical example of tariffs 
and trade between Cuba (representing the rest of the world) and 
Britain that, he contended, demonstrates the proposition. If Cuba 
imposes tariffs from a situation of perfectly free trade, Britain will 
initially find itself importing the same value of goods from Cuba but 
exporting less. Specie flow will make up the difference, thereby re¬ 
ducing British prices and raising Cuban prices. While he made some 
particular assumptions, such as constant British outlays on Cuban 
products, Torrens concluded that his example proved that the “ulti¬ 
mate incidence of the import duty imposed upon British goods would 
be upon the British producers. The wealth of England would be 
decreased by the amount of the duty—the wealth of Cuba would be 
increased by its amount” (p. 36).

Torrens’s policy recommendations sparked a controversy among 
economists that even spilled over into parliamentary debates. He in¬ 
sisted that “the following practical rules of commercial policy are 
direct and necessary corollaries” from the principles he previously 
described: 

First,—To adopt, with respect to all foreign powers, the principle
of reciprocity.—Second,—To lower the import duties
upon the goods produced in countries receiving British
goods upon terms equally favourable.—Third,—To impose
high or prohibitory duties upon goods, the produce of countries
imposing high or prohibitory duties upon British
goods. Fourth,—To admit, duty free, all raw materials employed
in the processes of reproduction. [Pp. 47-48]
This could hardly constitute the commercial policy recommendations
of a cosmopolitan free trader. In fact, Torrens was acutely aware of 
the distinction between national and world welfare, for two para¬ 
graphs later he wrote that “unrestricted interchange of commodities, 
between different countries, would increase the wealth of the world.” 
But British welfare was at stake, and he scolded the government for 
having “deprived the country of the advantages which our manufac¬ 
turing superiority would otherwise have secured” and “lowered the 
prices of British goods in foreign markets” (p. 48). In sum, Torrens 
believed that “reciprocity should be the universal rule” (p. 65) and
that “the sound principle of commercial policy is, to oppose foreign
tariffs by retaliatory duties, and to lower our import duties in favour
of those countries which may consent to trade with us on terms of
reciprocity” (p. 50). 2

Despite his numerous early publications on this question, Torrens 
was not the first to think through the effects of tariffs on the terms of 
trade. John Stuart Mill wrote at length on the issue in 1829 or 1830,
but that essay was not published until 1844. 3 It was here that Mill
questioned “whether any country, by its own legislative policy, can 
engross to itself a larger share of the benefits of foreign commerce, 
than would fall to it in the natural or spontaneous course of trade” (p. 
21). He answered affirmatively and explained more clearly than Tor¬ 
rens the advantage of trade taxes if foreign demand is inelastic. Mill 
cautioned that while there are such advantages under certain condi¬ 
tions, “the determining circumstances are of a nature so imperfectly 
ascertainable, that it must be almost impossible to decide with any 
certainty, even after the tax has been imposed, whether we have been 
gainers by it or losers” (pp. 24—25). Furthermore, because the tax 
could eliminate trade and because foreigners can buy from other 

2 According to O’Brien (1977), Torrens was also quite active in persuading the Conservatives,
Benjamin Disraeli in particular, to abandon protection in favor of his program
of reciprocity. In 1850, when controversy still simmered over the course of
British trade policy, Torrens privately advocated a reintroduction of agricultural protection
on grounds that it was necessary to counter foreign tariffs. Reciprocity, he
believed, was the only scientific basis on which such protection could be defended.

3 The essay “Of the Laws of Interchange between Nations; and the Distribution of
the Gains of Commerce among the Countries of the Commercial World” appeared in
Mill’s Essays on Some Unsettled Questions of Political Economy. This, of course, is the famous
essay that set down the theory of reciprocal demand as the determinant of the terms of
trade. In his preface to this collection of essays, Mill states that they have been published
“under the impression that the controversies excited by Colonel Torrens’ Budget
have again called the attention of political economists to the discussions of the abstract
science. . . . It will be seen that opinions identical in principle with those promulgated
by Colonel Torrens (there would probably be considerable difference as to the extent
of their practical application) have been held by the writer for more than fifteen years”
(p. v). It should be mentioned that in 1835 in Ireland Mountifort Longfield also made
progress on the theory of tariffs and the terms of trade. However, his ideas drew scant
attention in Britain, and he played no role in the tariff debates there.
countries, he argued that “even on the most selfish principles, therefore,
the benefit of such a tax is always extremely precarious” (p. 25).

In discussing reciprocity, Mill distinguished between a protecting 
duty, which encourages a particular branch of domestic industry by 
attracting labor and capital, and a revenue duty, on those products 
not produced at home. A “protecting duty can never be a cause of 
gain, but always and necessarily of loss, to the country imposing it” (p. 
28), Mill wrote; hence they are without justification. But with revenue 
duties,

considerations of reciprocity, which are quite unessential
when the matter in debate is a protecting duty, are of material
importance when the repeal of duties of this other description
is discussed. A country cannot be expected to renounce
the power of taxing foreigners, unless foreigners will
in return practise towards itself the same forbearance. The
only mode in which a country can save itself from being a
loser by the duties imposed by other countries on its commodities,
is to impose corresponding duties on theirs. [P. 29]

Indeed, Mill observed with concern the tariffs of other countries. Of 
the severe protectionist policies in France, the Netherlands, and
the United States, he believed that “these duties, though chiefly injurious
to the countries imposing them, have also been highly injurious to
England” (pp. 37-38) by lowering the price of its exports. In a revealing
review of Torrens’s policy views, Mill dubbed foreign tariffs
“the real source of alarm” (1843, p. 86). 

Thus Mill accepted, indeed originated, the theoretical point that a 
tariff can improve a country's terms of trade. But he exercised great 
restraint in drawing specific policy recommendations for Britain from 
this proposition. His caveat was that Torrens, “as is not unusual with 
him, seems to us to overstate the importance and urgency of a portion 
of his doctrines in their application to the immediate circumstances of 
the country” (p. 86). 

Other economists were not as sympathetic to Torrens’s view as Mill. 
Nassau Senior, who wrote a lengthy critique in 1843, was the most 
visible opponent of Torrens. Unfortunately for his cause, Senior’s 
essay, which some consider a political move to support unilateral
free trade, was a weak and meandering response. Despite this, the 
essay does score some telling points. Senior first accused Torrens of 
rejuvenating mercantilism. He then said that Torrens, while high¬ 
lighting the adverse terms of trade impact of reducing tariffs, ignored 
the costs such trade restraints entail:

Torrens assumes, first, that a country can exclude foreign
commodities without diminishing the efficiency of its own
labour....
It is a great mistake to suppose that a country which rejects
the territorial division of labour, suffers merely by the
greater dearness of the commodities which it is forced to
produce instead of importing them. It incurs a further, and
in many instances greater, injury—in the general diminution
of the efficiency of its own industry, occasioned by the misdirection
of capital and the diminished division of labour.
[1843, pp. 12, 14]

Senior accepted but dismissed Torrens’s example of a tariff’s impact
on the terms of trade: “We believe this to be true; but we believe
it to be one of those barren truths from which no practical inferences 
can be drawn.... In short, when he [Torrens] seriously urges us to act 
as if his hypothesis represented the actual state of things, we utterly 
dissent from, and repudiate his doctrine” (pp. 36-37). He also chided 
Torrens for assuming that Britain was the innocent victim of foreign 
tariffs when its own tariff barriers were substantial as well. 

Other critics emerged as well. 4 One of the most incisive was Her¬ 
man Merivale (1842, 2:305-11), who restated Torrens’s example of 
Cuba in purely barter terms to focus on the essence of the tariff and 
terms of trade argument. Merivale then introduced a second supplier 
of sugar to Britain, Brazil. If Cuba placed duties on imports from 
Britain, thereby raising the price of its sugar, Britain could simply 
switch its source of supply to “the next cheapest country producing the 
same commodities as Cuba” (2:310). In all, Britain would be hurt only 
in proportion to the gap between Cuba’s original price and Brazil’s 
price of sugar. The trade of Cuba with Britain would be ruined, and 
Brazil would be the real beneficiary of Cuba’s tariff. By allowing com¬ 
petition among Britain’s import suppliers, Merivale demonstrated 
that Torrens exaggerated the impact of foreign tariffs if not all other 
nations increased their tariffs. 

Torrens’s reply was ineffective: he admitted that if Merivale’s “assumption
bore any resemblance to actual circumstances, the Cuba
tariff could have a very slender effect in altering the terms of interna¬ 
tional exchange to the disadvantage of England” (1844, p. 358). Tor¬ 
rens was left to assert that his example of all foreign countries increas¬ 
ing their tariffs was more realistic. 

Yet Torrens’s analysis survived the onslaught of other theorists, 
though not without qualification. That he was correct in theory was 
confirmed when the controversy prompted Mill to publish his previously
written essay on the subject.

4 One early critic was Thompson (1833a, 1833b). Other comments are in Lawson
(1844, pp. 153-47), McCulloch (1849, pp. 166-68), Norman (1869, pp. 16-29), and
the three anonymous articles “Colonel Torrens on Free Trade” (1845), “Reciprocal
Free Trade” (1843), and “Professor Lawson’s Lectures on Political Economy” (1844).

Other economists gradually if reluctantly acknowledged his theoretical point.
But as to the policy recommendations that Torrens considered natural conclusions of his
analysis, even Mill balked, and the other economists rejected them 
entirely. 

Despite Torrens's warnings, Britain adopted unilateral free trade 
under which it flourished in a growing world economy. The question 
still remains about the welfare effect of free trade on Britain: Did 
tariff reductions, all other things being equal, so adversely affect Brit¬ 
ain’s terms of trade as to make the country worse off? Were Torrens’s 
concerns warranted, and how significant were such effects? 


III. A Model of British Foreign Trade, 1820-46 

Economists have long been fascinated with the international trade 
aspects of Britain’s industrial revolution (see, e.g., Imlah 1958; Davis 
1979; Crouzet 1980; Findlay 1982). Having been the first to undergo 
an industrial revolution, Britain was virtually the sole exporter of 
many manufactured goods. In 1831, over 90 percent of Britain’s 
exports were manufactures, and over half of these were cotton tex¬ 
tiles, with woolens and iron constituting the rest (Crafts 1985, p. 143). 
The era from the 1780s until the mid-1800s was a period of tremen¬ 
dous growth of British exports, in part because of technical progress 
in the export sector. This growth sharply reduced export prices, and 
Britain’s net barter terms of trade declined throughout this period. 
Imports were highly concentrated as well. In 1831, over 70 percent of 
retained imports were raw materials, primarily cotton from the 
United States. The rest consisted of foodstuffs such as sugar, tea, and 
wine. 

This section aims to incorporate available data into an econometric 
model of British foreign trade. As will be seen in Section IV, the basic 
model used to assess the extent to which a terms of trade deteriora¬ 
tion would offset other gains from tariff reduction requires informa¬ 
tion on import and export supply and demand elasticities. There 
being no estimates of these for Britain during this period, an attempt 
will be made here to estimate such elasticities for the relevant time 
period. 5 A simultaneous equation model, similar to that in Goldstein 
and Khan (1978), is employed to account for the endogenous deter¬ 
mination of import and export prices and quantities. Two systems of 
equations are estimated, one related to British exports and the other 
to British imports. Although serious questions will remain about the 

5 A large literature on estimating trade elasticities exists. For a survey of recent work
and methodological concerns, see Goldstein and Khan (1984).
data and the results are decidedly mixed, partial information is
gleaned about the general values of the elasticities. 

A. British Exports 

The world’s demand for British exports is specified as 

$$\ln(X^d) = \alpha_0 + \alpha_1 \ln(PX) + \alpha_2 \ln(WPI^*) + \alpha_3 \ln(Y^*) + \alpha_4 \ln(X_{-1}), \tag{1}$$

where $X^d$ = quantity of exports demanded, $PX$ = price of exports,
$WPI^*$ = a trade-weighted wholesale price index of Britain’s trading
partners, $Y^*$ = the trade-weighted real income of Britain’s trading
partners, and $X_{-1}$ = lagged export volume. All variables apply to
Britain, except those with asterisks, which denote foreign variables.
This relatively standard form is expressed in log-linear terms so that
$\alpha_1$ and $\alpha_3$ are the price and income elasticities of foreign demand for
British exports, respectively.

The supply of exports is specified in a log-linear form as 

$$\ln(X^s) = \beta_0 + \beta_1 \ln(PX) + \beta_2 \ln(WPI) + \beta_3 \ln(TFP) + \beta_4 \ln(PM_{-1}) + \beta_5 \ln(PX_{-1}), \tag{2}$$

where $X^s$ = quantity of exports supplied, $PX$ = price of exports, $WPI$
= domestic wholesale price index, $TFP$ = index of total factor productivity,
$PM_{-1}$ = lagged import prices, and $PX_{-1}$ = lagged export
prices. Exporters are assumed to react positively to the price of exports
relative to domestic prices. In addition, technical progress in the
export sector will result in an outward shifting supply of exports.
Because many of Britain’s exports required imported intermediate
goods, a lagged import price term is included.

Equation (2) can be solved for the price of exports to yield

$$\ln(PX) = \Pi_0 + \Pi_1 \ln(X) + \Pi_2 \ln(WPI) + \Pi_3 \ln(TFP) + \Pi_4 \ln(PM_{-1}) + \Pi_5 \ln(PX_{-1}), \tag{3}$$

where $\Pi_0 = -\beta_0/\beta_1$, $\Pi_1 = 1/\beta_1$, $\Pi_2 = -\beta_2/\beta_1$, $\Pi_3 = -\beta_3/\beta_1$,
$\Pi_4 = -\beta_4/\beta_1$, and $\Pi_5 = -\beta_5/\beta_1$. Since $\beta_1$ is assumed to be greater than zero,
we would expect that $\Pi_1 > 0$, $\Pi_2 > 0$, $\Pi_3 < 0$, $\Pi_4 > 0$, and $\Pi_5 > 0$. As
in Goldstein and Khan (1978), the parameter $\beta_1$, which is the price
elasticity of export supply, can be obtained from (3) by calculating
$(\Pi_1)^{-1}$.

Equations (1) and (3) constitute the simultaneous equation system 
incorporating two endogenous variables. Estimates of the structural 
parameters can be obtained by estimating these two equations using 
two-stage least squares. 
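
The logic of the two-stage procedure can be illustrated with a minimal sketch on synthetic data. It is not the estimation behind table 1: the parameter values, the stripped-down two-equation system (one demand shifter, one supply shifter), and the helper names `solve`, `ols`, `two_stage_least_squares`, `simulate_market`, and `estimate_system` are all assumptions made for illustration. Each endogenous regressor is first projected on the exogenous variables, and the equation of interest is then fitted on those projections.

```python
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ols(y, X):
    """Least-squares coefficients of y on the rows of X via the normal equations."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def two_stage_least_squares(y, X, Z):
    """2SLS: replace each regressor by its projection on the instruments Z, then run OLS."""
    k = len(X[0])
    # First stage: one regression on the instruments per column of X.
    stage1 = [ols([row[j] for row in X], Z) for j in range(k)]
    X_hat = [[sum(c * z for c, z in zip(stage1[j], zrow)) for j in range(k)] for zrow in Z]
    # Second stage: OLS of y on the fitted regressors.
    return ols(y, X_hat)


def simulate_market(n, seed=0):
    """Draw (log quantity, log price, income, tfp) from a hypothetical two-equation system."""
    rng = random.Random(seed)
    a0, a1, a3 = 5.0, -1.1, 0.8       # hypothetical demand parameters
    pi0, pi1, pi3 = 2.0, 0.7, -1.0    # hypothetical inverse-supply parameters
    rows = []
    for _ in range(n):
        income, tfp = rng.gauss(0, 1), rng.gauss(0, 1)
        u, v = rng.gauss(0, 0.5), rng.gauss(0, 0.5)
        # Solve demand and inverse supply jointly for quantity, then back out price.
        q = (a0 + a1 * pi0 + a1 * pi3 * tfp + a3 * income + a1 * v + u) / (1 - a1 * pi1)
        p = pi0 + pi1 * q + pi3 * tfp + v
        rows.append((q, p, income, tfp))
    return rows


def estimate_system(rows):
    """Estimate the demand and inverse-supply equations, instrumenting with all exogenous variables."""
    q = [r[0] for r in rows]
    p = [r[1] for r in rows]
    Z = [[1.0, r[2], r[3]] for r in rows]
    demand = two_stage_least_squares(q, [[1.0, r[1], r[2]] for r in rows], Z)
    inverse_supply = two_stage_least_squares(p, [[1.0, r[0], r[3]] for r in rows], Z)
    return demand, inverse_supply


if __name__ == "__main__":
    demand, inverse_supply = estimate_system(simulate_market(5000))
    print("demand (const, price, income):", [round(c, 2) for c in demand])
    print("inverse supply (const, quantity, tfp):", [round(c, 2) for c in inverse_supply])
    print("implied supply elasticity:", round(1.0 / inverse_supply[1], 2))
```

The supply elasticity is recovered in the last line as the reciprocal of the quantity coefficient in the price equation, mirroring the step from (3) back to (2). With only 27 annual observations, as in the application here, such estimates would be far noisier than in this large simulated sample.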
An issue that immediately arises is whether the data are available to
carry out the estimation. Fortunately, annual time-series data on the 
volume and price of British exports for this period have been care¬ 
fully calculated by Imlah (1958). 6 A wholesale price index for Britain 
is available in Mitchell (1962). The total factor productivity index was 
constructed as a weighted average of the total factor productivity in 
export industries, and its derivation is described in the Appendix. 

Detailed information on many of Britain’s trading partners is 
nonexistent. Such data are available only for the United States and 
France, so foreign variables must be confined to these two countries. 
The United States and France were significant trading partners of 
Britain and accounted for over a quarter of Britain’s exports over this 
period. However, important markets in Asia, Africa, and elsewhere 
are ignored. These omissions bias Britain’s export demand elasticity 
downward. 

Gross national product estimates for the United States are available 
in Berry (1968). Gross domestic product estimates for France are
available in Maddison (1982). A wholesale price index for the United 
States is available in U.S. Department of Commerce (1975) and for 
France in Mitchell (1980). Trade shares are in Mitchell (1962). 

B. British Imports 

The market for British imports is modeled in a way similar to that of 
the export market. British import demand is specified as 

$$\ln(M^d) = a_0 + a_1 \ln(PM) + a_2 \ln(WPI) + a_3 \ln(IP) + a_4 \ln(M_{-1}), \tag{4}$$

where $M^d$ = quantity of imports demanded, $PM$ = price of imports,
$IP$ = index of industrial production, and $M_{-1}$ = lagged import volume.
Import demand is again supposed to be a negative function of
the price of imports, controlling for domestic prices, and a positive
function of industrial production, which is used as a proxy for income
(see discussion below).

The import supply equation follows a form similar to the export 
supply equation: 

$$\ln(M^s) = b_0 + b_1 \ln(PM) + b_2 \ln(WPI^*) + b_3 \ln(TFP^*) + b_4 \ln(PM_{-1}). \tag{5}$$


6 The export price index is net of foreign tariffs, the proper one for export supply. 
Because the same index is also used for export demand, we must assume that foreign 
tariffs were relatively unchanged over this time period. The import price index is net of 
British tariffs, but Imlah shows that the tariffs were relatively constant. 
Equation (5) can be solved for the price of imports to yield

$$\ln(PM) = \Theta_0 + \Theta_1 \ln(M) + \Theta_2 \ln(WPI^*) + \Theta_3 \ln(TFP^*) + \Theta_4 \ln(PM_{-1}), \tag{6}$$

where $\Theta_0 = -b_0/b_1$, $\Theta_1 = 1/b_1$, $\Theta_2 = -b_2/b_1$, $\Theta_3 = -b_3/b_1$, and
$\Theta_4 = -b_4/b_1$.

Equations (4) and (6) constitute the simultaneous equation system 
for British imports, and estimates of the structural parameters can be 
obtained by estimating these two equations using two-stage least 
squares. 

The prices and quantities of British imports are given in Imlah 
(1958). An industrial production index was constructed by Hoffman 
(given in Mitchell [1962]) and is used as a proxy for real income 
because annual income data do not exist. According to Harley (1982), 
it turns out to be a good proxy for what estimates there are of gross 
national product. 7 The TFP* index is a weighted average of the total 
factor productivity of the foreign suppliers’ (i.e., American and
French) export industries, and its construction is described in the
Appendix. 

C. Estimation 

Equations (1), (3), (4), and (6) were estimated using two-stage least 
squares with data for 1820-46. The instruments consisted of all exog¬ 
enous variables for each system. The main results, which contain the 
information needed for the exercise in the next section, are presented 
in table 1. Unfortunately, the results are quite mixed, although the 
coefficients are relatively invariant over various specifications. It is 
possible to recover some information about the general range of the 
two most important elasticities, those for export and import demand. 
Many of the coefficients have the predicted sign, and these results are 
not out of line with those estimated by Goldstein and Khan (1978) for 
the modern era. Foreign demand for British exports is shown to be
just over unit elastic, but one cannot reject the hypothesis that it is 
inelastic. As Britain was the only major exporter of manufactured 
goods, this finding is plausible. Because the result is perched at unit 
elasticity, it is hard to evaluate whether McCloskey (1980) and Crafts 
(1985) were correct in assuming that export demand was price inelas¬ 
tic or whether Mill (1844) was correct in assuming that it was price 

7 Harley (1982) examined British national income in detail and concluded that the 
Hoffman index was a misleading indicator of income during the early stages of the 
industrial revolution (1780-1800), but that it is quite accurate for the later period 
1815-41. 
TABLE 1

Regression Results

Export demand:

ln(X) = 5.02 - 1.1 ln(PX) + .48 ln(WPI*) - .07 ln(Y*) + .55 ln(X_{-1})
        (1.8)  (.41)        (.57)          (.15)        (.17)

R² = .94, F = 109.4

Export supply:

ln(PX) = 9.95 + .70 ln(X) + .11 ln(WPI) - 2.59 ln(TFP) + .02 ln(PM_{-1}) + .77 ln(PX_{-1})
         (9.42) (1.21)      (.24)         (5.16)         (.24)             (.44)

R² = .92, F = 55.5

Import demand:

ln(M) = .20 - .98 ln(PM) + 1.55 ln(WPI) + 1.85 ln(IP) - .78 ln(M_{-1})
        (1.47) (.54)       (.27)          (.27)         (.26)

R² = .93, F = 95.9

Import supply:

ln(PM) = 5.41 - .09 ln(M) + .70 ln(WPI*) - .48 ln(TFP*) + .16 ln(PM_{-1})
         (2.57) (.15)       (.22)          (.65)          (.21)

R² = .75, F = 16.2

Note.—Standard errors are in parentheses. Estimates are for the years 1820-46.


elastic. But because the estimated elasticity of export demand may be
biased downward by the lack of foreign data, it might be safe to
conclude that British export demand was more on the elastic side.

The British export supply equation fails to yield statistically
significant estimates. The coefficient does hint that supply was slightly
elastic (1.4, because $\beta_1 = 1/\Pi_1$). For the modern period, Goldstein and
Khan also found that export supply elasticities were in the range from
1 to 6 when estimated by simultaneous methods.
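
The 1.4 figure follows directly from table 1, where the coefficient on ln(X) in the export price equation is .70 with a standard error of 1.21. The short sketch below reproduces the inversion. The accompanying standard error is a first-order (delta method) approximation that is an assumption of this illustration, not a calculation reported in the text, and the function name `supply_elasticity` is hypothetical.

```python
def supply_elasticity(pi_1, se_pi_1):
    """Invert the quantity coefficient of the price equation to get the supply elasticity."""
    beta_1 = 1.0 / pi_1
    # Delta-method approximation (an assumption of this sketch): |d(1/pi)/d(pi)| = 1/pi^2.
    se_beta_1 = se_pi_1 / pi_1 ** 2
    return beta_1, se_beta_1


# Values taken from the export supply row of table 1.
beta_1, se_beta_1 = supply_elasticity(0.70, 1.21)
print(f"export supply elasticity: {beta_1:.2f} (approximate s.e. {se_beta_1:.2f})")
```

Because the approximate standard error is larger than the point estimate itself, the sketch is consistent with the statement that the coefficient only hints at an elastic supply.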

The import demand equation indicates that British import demand
was barely inelastic (-0.98), but one cannot reject the hypothesis that
it is elastic. This result is in accord with Wright’s (1971) finding that 
British import demand for cotton was inelastic. 8 The income (or in¬ 
dustrial production) effect on British imports is also found to be quite 
strong. 
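
The statements that unit elasticity cannot be rejected for either demand equation can be checked against table 1 with a simple t-ratio. The sketch below is illustrative only: the function name `t_stat` is hypothetical, and the informal threshold of about 2 in absolute value is a rule of thumb assumed here rather than a critical value taken from the text.

```python
def t_stat(estimate, std_error, null_value):
    """t-ratio for the hypothesis that the true coefficient equals null_value."""
    return (estimate - null_value) / std_error


# Price coefficients and standard errors from the demand rows of table 1.
export_t = t_stat(-1.1, 0.41, -1.0)
import_t = t_stat(-0.98, 0.54, -1.0)

for label, t in [("export demand", export_t), ("import demand", import_t)]:
    verdict = "cannot reject" if abs(t) < 2.0 else "reject"
    print(f"{label}: t = {t:.2f}, {verdict} unit elasticity")
```

Both ratios are well inside the rule-of-thumb band, so the data cannot distinguish slightly elastic from slightly inelastic demand in either market.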

The import supply equation also poses a problem in terms of get¬ 
ting significant estimates. The price coefficient on import supply is 
insignificant and carries the wrong sign. For the calculations below, I 
take the import supply elasticity to be 1.5, which is close to that of 
British export supply. This value suggests that Britain had some 

8 Wright gave a range between -0.31 and -0.65. The estimate here is presumably
more elastic because it includes total imports, which covers commodities such as tea and
sugar.
monopsonist power, which is plausible because it was such a large
purchaser of the world’s raw materials. It also is near Wright’s finding 
(albeit econometrically very poor as well) of an inelastic supply of 
cotton. In the results calculated below, a range of import supply elas¬ 
ticities will be used to determine how important this is for Britain’s 
welfare. Fortunately, as will be seen, the results are consistent over a 
wide range of import supply elasticities. 

IV. Did Britain Lose from Free Trade? 

This section analyzes the welfare effects of a British tariff reduction, 
with allowances for its impact on the terms of trade. The appropriate 
model to evaluate such a welfare change has been developed by 
Basevi (1968) and Walker (1969). The key to the model is the deriva¬ 
tion of an expression for the change in relative prices (terms of trade) 
needed to restore the balance of trade, disturbed by the tariff cut, to 
its original value. 9 This change in relative prices is endogenous in the
sense that it depends on the underlying trade elasticities, which indi¬ 
cate the degree of the country’s international market power, and the 
amount of the tariff reduction. Using this expression, the model then 
permits an evaluation of national welfare taking into account the 
tariff’s effect on the terms of trade. The reader is referred to Basevi 
(1968) and Walker (1969) for details. 10

Following Walker (1969), the change in aggregate welfare from a 
tariff reduction can be divided into that from the terms of trade 
deterioration and that from the efficiency gain. The approximations 
are, respectively, 

$$W_t = -dp_m \cdot M, \tag{7}$$

$$W_e = .5 \cdot (t - dp_m)\,dM + dp_x (X + .5 \cdot dX), \tag{8}$$

where $dp_m$ and $dp_x$ are the changes in import and export prices, $t$ is the
initial tariff, and $M$ and $dM$ and $X$ and $dX$ are the initial values and the
changes in import and export volume, respectively. Embedded in
the $dp_m$, $dp_x$, $dM$, and $dX$ terms are the trade elasticities estimated in
Section III and the expression for the change in the terms of trade.
Equation (7) captures the adverse terms of trade impact from a tariff 

9 Such an expression would capture exactly what Torrens was concerned about. As 
Viner (1937, p. 298) explains, “Torrens, in 1841-42, in the course of an attempt to
demonstrate that retaliation against foreign tariffs would be beneficial to England even 
if such retaliation did not lead foreign countries to reduce their tariffs, placed main 
emphasis on the role of relative price changes in adjusting the international balances to tariff 
changes" (emphasis added). 

10 The only modification made to Basevi's derivation was to allow for discrete tariff 
changes since he examines the case of tariff abolition alone. 
reduction as the increase in import prices (i.e., part of the lost tariff
revenue) evaluated over the initial volume of imports. Equation (8)
captures the increase in consumer surplus from added consumption
of importables and the increase in producer surplus among exporters
who expand production, that is, the standard consumption and production
gains from tariff reduction. The relative size of $W_t$ as against
$W_e$ determines the overall welfare impact of the tariff reduction.
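
The decomposition in equations (7) and (8) is simple enough to sketch in a few lines. The code below assumes the two expressions read W_t = -dp_m M and W_e = .5 (t - dp_m) dM + dp_x (X + .5 dX), with prices normalized so that changes are proportions of the initial price. The input numbers are purely hypothetical and are not the values behind table 2; in the model itself the price and volume changes would be derived from the trade elasticities rather than supplied by hand.

```python
def welfare_change(dp_m, dp_x, t, M, dM, X, dX):
    """Split the welfare effect of a tariff cut into terms of trade and efficiency parts."""
    # Equation (7): higher import prices paid on the initial import volume.
    w_terms_of_trade = -dp_m * M
    # Equation (8): consumption gain on added imports plus the export-side term.
    w_efficiency = 0.5 * (t - dp_m) * dM + dp_x * (X + 0.5 * dX)
    return w_terms_of_trade, w_efficiency, w_terms_of_trade + w_efficiency


# Hypothetical inputs chosen only to show the mechanics.
w_t, w_e, total = welfare_change(dp_m=0.02, dp_x=0.01, t=0.35,
                                 M=100.0, dM=5.0, X=90.0, dX=4.0)
print(f"terms of trade effect: {w_t:.3f}")
print(f"efficiency effect:     {w_e:.3f}")
print(f"net welfare change:    {total:.3f}")
```

In this made-up case the terms of trade loss slightly exceeds the efficiency gain, so the net change is negative; the sign of the sum is exactly the comparison of the two components described above.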

In addition to the elasticities, this framework requires information 
on the uniform tariff equivalent. As McCloskey (1980) has discussed, 
the bulk of British tariff revenue was collected against consumer 
items such as sugar, tea, tobacco, wine, and coffee at the time of the 
repeal of the Corn Laws. Relatively little was collected against goods 
used as production inputs, such as cotton and lumber. Therefore, 
nominal tariff rates can probably be used with some justification in 
lieu of effective tariff rates. According to McCloskey’s calculations, 
the British tariff was an average 35 percent in 1841. Because Britain 
approached free trade gradually, the rate was still 25 percent in 1854 
if 1841 commodity weights are retained. 11 For purposes of the calculations
here, I take the tariff to be 0.35 and use the 1846 values of
imports and exports for the baseline calibration. These data along
with the elasticities are used to address three questions: (1) What was
the impact on British welfare of a small tariff reduction, say of the
magnitude engendered by the repeal of the Corn Laws? (2)
What was the welfare impact of a larger move to free trade? (3) What 
do these results say about whether Britain gained or lost from free 
trade overall? 

A. Welfare Change from a Small Unilateral Tariff Cut 

The first experiment is of a uniform four-percentage-point tariff re¬ 
duction on all commodities, or about an 11 percent tariff cut. This 
would be a large tariff change for 1 year and is not exactly equivalent 
to the repeal of the Corn Laws. The model here is highly aggregated 
and cannot evaluate in detail cuts in a particular tariff, as Williamson’s 
(1986) model can. In addition, from McCloskey’s data, elimination of 
tariffs on wheat and other grains would reduce the average tariff by 
only one percentage point (using the old commodity weights). A four- 
percentage-point uniform cut would be illustrative of tariff reduc¬ 
tions taking place over a few years because its overall size is a bit larger 
than that achieved by the elimination of the Corn Laws. 
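
The proportional size of these cuts follows from simple arithmetic on the 35 percent baseline tariff. The sketch below, with a hypothetical helper name, only restates that arithmetic.

```python
def proportional_cut(initial_rate_pct, cut_points):
    """Share of the initial tariff removed by a cut of the given percentage points."""
    return cut_points / initial_rate_pct


uniform_cut = proportional_cut(35.0, 4.0)   # the experiment in this section
grain_only = proportional_cut(35.0, 1.0)    # eliminating the grain tariffs alone
print(f"four-point cut: {uniform_cut:.1%} of the initial tariff")
print(f"grain tariffs only: {grain_only:.1%} of the initial tariff")
```

The four-point experiment is therefore roughly four times the average tariff reduction attributed to the grain tariffs alone.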


11 Data in Imlah (1958, p. 159) suggest that tariff revenue as a fraction of dutiable
imports fell continuously from 1842 until about 1875, when the ratio leveled off at just
over 5 percent.
The results of a unilateral reduction in British tariffs of this magnitude
are presented in table 2 in example 1. They show that the
welfare effect of a unilateral British tariff cut in the mid-1840s would 
be a loss of £2 million. This amounts at most to roughly 0.4 percent of 
British national income, using Crafts (1987) for national income in 
the 1840s. By contrast, these results are large relative to the losses 
from tariff elimination in the United States in 1960 as examined by 
Basevi. The adverse shift in Britain’s terms of trade is not insignificant 
either, on the order of 5.5 percent. 

Table 2 also presents results for five other situations using a range 
of elasticity values to test for the sensitivity of the results. The second 
example shows that a different import supply elasticity, such as the 
larger value of 3.0, reduces the overall welfare impact of tariff reduction
negligibly. The welfare cost is reduced with a more elastic foreign
supply curve because it means that the country has less monopsonist 
power to squeeze import prices. The third and fourth examples dem¬ 
onstrate that the welfare cost is shrunk further when demand for 
British exports is assumed to be more elastic. These scenarios show 
that with the export demand elasticity at -1.5 and the import supply 
elasticity at 1.5 and 3.0, the terms of trade fall by 3.2 percent and 
welfare is reduced by about £1.7 million or 0.35 percent of national 
income. Williamson (1986), in modifying his model to account for 
Britain’s impact on world markets, assumes the export demand elasticity
facing Britain to be -2.5 and the import supply elasticity to be
highly inelastic at 0.5 (example 5). The loss here is essentially the same 
as in the other examples. 

For a variety of plausible British foreign trade elasticities, a unilat¬ 
eral tariff reduction is shown to reduce national income. The finding 
of a negative welfare effect is robust over these elasticities, and the 
terms of trade deterioration dominates the efficiency gain for every 
example. 12 Questions remain about the magnitude of the welfare re¬ 
duction. The elasticities derived in Section III imply a relatively siz¬ 
able loss in national income for such a moderate tariff reduction. This 
is primarily the result of an inelastic export demand. Especially when 
export demand is assumed to be more elastic, as it might be after 
downward biases are eliminated, Britain’s loss is greatly reduced. 

12 Basevi uses a much more elastic export demand elasticity and finds a similar
negative welfare effect.

B. Welfare Change from a Large Tariff Reduction

While one must be extremely cautious about using elasticity estimates
for large price changes, an intriguing and irresistible question concerns
the size of the welfare change if Britain had rushed immediately
to free trade. Harley and McCloskey (1981) employ a simple calculation
to conclude that Britain would have lost at most 6 percent of
national income from a reduction in its tariff of 21 percentage points,
this being the difference between Britain’s tariff in 1841 and 1881. If
the elasticity estimates from above are used in conjunction with a 21-
percentage-point tariff reduction (example 6 in table 2), we find that
the loss is approximately 2 percent of national income. If the export
demand elasticity is assumed to be more elastic (example 7), then this 
loss drops to about 1.6 percent of national income. These are substan¬ 
tially lower than Harley and McCloskey’s estimate of the loss. Thus 
according to this model, the largest plausible British loss from unilat¬ 
eral free trade is significantly smaller than previously suggested. 

C. Did Britain Lose from Free Trade?

The results presented above may help to answer questions raised by 
Torrens and by economic historians. They appear to confirm the 
judgment that adverse terms of trade shifts would outweigh efficiency 
gains from a British tariff reduction. They are interesting and impor¬ 
tant in their own right but may be misleading if one jumps from these 
findings to the conclusion that Britain lost from free trade during the 
mid-nineteenth century. Aside from the fact that the model is purely 
static and that elasticities change over time and numerous other legiti¬ 
mate concerns, the results above isolate only one ceteris paribus com¬ 
parative static, namely, tariff reductions undertaken unilaterally. 

Britain did embrace tariff reforms without negotiating with other 
countries, until the Cobden-Chevalier treaty of 1860. But this does 
not mean that other countries kept their policies fixed after Britain 
liberalized. In fact, many European nations began reform of their 
tariff codes shortly after the British actions. This consideration sug¬ 
gests that unilateral free trade worked, in some sense, in its demon¬ 
stration effect. While Torrens and others urged delay until other 
countries were prepared to reduce their tariffs as well, unilateralists 
such as Lord Overstone were confident that “other countries witness¬ 
ing our prosperity will find it necessary to follow our example” 
(O’Brien 1971, 3:1458). And the unilateralists were partially right. As 
Britain eased into free trade from the 1840s and thereafter, Europe 
itself gradually moved in that direction too. 13


13 As Kindleberger indicates at the end of his essay on the rise of free trade in
nineteenth-century Europe, the “movement to free trade in the 1850's in the Netherlands,
Belgium, Spain, Portugal, Denmark, Norway and Sweden, along with the [other]
countries discussed in detail, suggests the possibility that Europe as a whole was
TABLE 3

Welfare Effect of British and Foreign Tariff Reductions
(Elasticity Assumptions and Initial Values)

                            Example 1    Example 8    Example 9

Export demand               -1.1         -1.1         -2
Export supply               1.43         1.43         1.43
Import demand               -.98         -.98         -.93
Import supply               1.5          1.5          2
Initial tariff              .35          .35          .35
Tariff change               -.04         -.04         -.04
Initial foreign tariff      0            .35          .35
Foreign tariff change       0            -.043        -.027
Import value                78.1         78.1         78.1
Export value                57.8         57.8         57.8
Terms of trade (-)          .0364        .0051        .0086
W_1                         -2.95        -1.475       -1.481
W_2                         .951         1.494        1.486
Net welfare effect          -2.00        .019         .005

Note.—Figures may not add because of rounding. Values are in millions of pounds sterling.
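As a quick arithmetic check on table 3, the net welfare effect in each example should equal the sum of the two welfare components, up to rounding. The sketch below assumes that the first component row is the terms-of-trade effect (called `w1` here) and the second the efficiency effect (called `w2`); these names and the rounding tolerance are illustrative assumptions, not the paper's notation.

```python
# Welfare components from table 3, in millions of pounds sterling.
# The keys "w1" (terms-of-trade effect) and "w2" (efficiency effect) are
# assumed labels for the two component rows of the table.
TABLE_3 = {
    "example 1": {"w1": -2.95, "w2": 0.951, "reported_net": -2.00},
    "example 8": {"w1": -1.475, "w2": 1.494, "reported_net": 0.019},
    "example 9": {"w1": -1.481, "w2": 1.486, "reported_net": 0.005},
}


def net_welfare(w1, w2):
    """Net welfare effect as the sum of the two components."""
    return w1 + w2


def check_table(table, tol=0.005):
    """Map each example to (computed net, True if it matches the reported net within tol)."""
    results = {}
    for name, row in table.items():
        net = net_welfare(row["w1"], row["w2"])
        results[name] = (net, abs(net - row["reported_net"]) <= tol)
    return results


if __name__ == "__main__":
    for name, (net, ok) in check_table(TABLE_3).items():
        print(f"{name}: computed net = {net:.3f}, matches table: {ok}")
```

The tolerance of 0.005 reflects the table's own caveat that figures may not add because of rounding.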


This motivation was not prevalent in America, but here as well we 
find tariff cuts coinciding with those of the British. The Tariff Act of 
1846, enacted precisely the year the Corn Laws were abolished, sub¬ 
stantially reduced the U.S. tariff, with the average duty falling from 
33 percent to 24 percent (Rabbeno 1895, p. 185). Tariffs were further 
cut in 1857, but the Civil War interrupted this move to freer trade. 

If foreign tariff reductions were taken into account, Britain might 
well have been better off in the context of this model. Foreign tariff 
reductions would mitigate the deterioration in the terms of trade by 
increasing foreign demand for British goods. Following Walker 
(1969), the model can be extended to allow foreign tariff reductions. 
As is shown in table 3, incorporating foreign tariffs makes an enor¬ 
mous difference. The first example shows that previous results ig¬ 
nored foreign tariffs. The second scenario (example 8) uses all the 
original parameters and, without guidance on the size of foreign 
tariffs facing Britain around 1846, assumes solely for purposes of 
illustration that foreign tariffs were also 35 percent. If the British 
tariff were reduced by four percentage points and the average for¬ 
eign tariff were reduced by slightly more, Britain would be better off 
since the gain from improved resource allocation would just offset the 


motivated by ideological considerations rather than [domestic] economic interests. . . .
Manchester and the English political economists persuaded Britain which persuaded
Europe, by precept and example” (1975, pp. 50-51). Of course, European countries began
to increase their tariffs in the 1880s and 1890s, but by then Britain’s dominance
in world markets had eroded considerably.
loss from inferior terms of trade. A final example demonstrates that
an even smaller foreign tariff reduction combined with a more elastic 
export demand and export supply elasticity would also allow the Brit¬ 
ish to break even. Examples can easily be constructed that result in 
improvements in Britain's terms of trade. 14 

Whether Britain was indeed made better off as a consequence of 
foreign tariff reductions is still a matter of speculation because of 
insufficient information on the extent of those reductions. But the 
issue of foreign countries following Britain in liberalizing their trade 
policies has rarely been a consideration in the debate surrounding the 
welfare effects of British free trade. As these examples show, how¬ 
ever, such considerations are essential if we wish to judge whether or 
not Britain ultimately reaped benefits from free trade in the mid¬ 
nineteenth century. 

The fact that foreign tariffs were liberalized without bargaining was 
largely fortuitous to Britain. Ironically, Britain finally decided to pur¬ 
sue a unilateral free-trade policy only after many unsuccessful years 
of seeking reciprocity treaties with the continent. One might suggest 
that in terms of optimal policy from a purely national point of view, 
Britain may have timed things well: trade restraints in the 1820s and 
1830s, then gradually freer trade in the 1840s and thereafter as its 
monopoly position began to erode. After all, Britain’s capacity over 
time to derive benefits from the tariff, by which Britain in essence 
imposed a scarcity of manufactured goods on world markets, was 
limited. As the industrial revolution spread to other countries, a 
spread accelerated by its tariff, Britain confronted increasing foreign 
competition, and its position in world trade became vastly different. 

V. Conclusions 

This paper has essentially been an empirical exercise in the history of 
economic thought: to resolve a debate about economic theory and its 
application to Britain in the nineteenth century. The simple exercise 
made here indicates that Torrens’s concerns were not mere fancy. All 
other things being equal, a purely unilateral tariff reduction was 
found to have reduced Britain’s welfare, although the overall static 
welfare cost of a large reduction was found to be much lower than 
previously suggested. But these results alone cannot answer the ques¬ 
tion whether Britain won or lost from free trade. Introducing foreign 

14 These results should not be too surprising to those familiar with the modern 
literature on trade liberalization. Most studies show that large trading regions suffer 
welfare losses with unilateral tariff reductions and welfare gains with multilateral tariff 
reductions (see Whalley 1985). 
tariff reductions into the analysis makes it quite possible that Britain
did, in fact, benefit from free trade. 


Appendix 

Data and Sources 

This Appendix goes into greater detail about the data used for the estima¬ 
tions in Section III and the methodology followed for creating some of the 
data sets. 

A. British Exports 

Data on British export volume and export prices are in Imlah (1958, pp. 94- 
96). A British wholesale price index for domestic commodities is in Mitchell 
(1962, p. 470). The foreign wholesale price index is an export-share weighted 
average of U.S. and French wholesale prices. Some estimations of trade elas¬ 
ticities use a weighted average of the export prices of the country's trading 
partners as a deflator. These data are not available, and the insertion of the 
wholesale price index is a justifiable substitute. The export weights were 
calculated from Mitchell (1962, pp. 313-14). The U.S. wholesale price series 
is from the U.S. Department of Commerce (1975, 1:201), and France's 
wholesale prices are in Mitchell (1980, p. 772). For foreign income, real gross 
national product estimates for the United States are in Berry (1968), and real
gross domestic product estimates for France are in Maddison (1982, p. 169). 

Another element of the British export supply equation is the total factor 
productivity index for the export sector. McCloskey (1981) has sectoral esti¬ 
mates of productivity growth, which have been disputed by Crafts (1985, 
1987), who provides his own estimates. On the basis of the judgments of 
Mokyr (1987) and Williamson (1987), I have chosen to rely on the estimates 
by McCloskey. Hence, I have adopted McCloskey’s methodology and have 
recalculated the productivity estimates for the time period at issue in this 
paper. 

McCloskey borrows the method derived by Gollop and Jorgenson (1980), 
in which total factor productivity in a sector is defined as changes in input 
prices, weighted by their share in costs, subtracted from the change in output 
price. Thus 

$$\mathrm{TFP}_{it} = \Delta p_i - \Delta(p_{Xi})\,\theta_{Xi} - \Delta(p_{Li})\,\theta_{Li} - \Delta(p_{Ki})\,\theta_{Ki},$$

where $\mathrm{TFP}_{it}$ is the change in total factor productivity for industry $i$ in year $t$,
$\Delta p_i$ is the change in the output price of sector $i$, and the subtracted terms are the
changes in input prices (intermediate input $X$, labor input $L$, and capital input $K$),
each weighted by its share in costs, $\theta$. This was done for the major export sector of the British
economy, cotton textiles. Prices of cotton textiles are calculated in Sandberg 
(1974, p. 239). Raw cotton prices in Britain and wages for textile labor are in 
Mitchell (1962, pp. 491 and 349, respectively). The price of spindles has been 
imputed from von Tunzelman (1978, p. 214), and factor shares are in Ellison 
(1886, p. 68 ff.). The average total factor productivity in cotton textiles for 
1820-46 grew at a 2.0 percent annual rate, close to but lower than McCloskey's
estimate of 2.6 percent for 1780-1860, which may indicate that textile
productivity slowed into the mid-nineteenth century. 
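The price-based productivity measure described above can be sketched in a few lines. The function follows the sign convention exactly as the text states it (share-weighted input price changes subtracted from the output price change). The function name, the input labels, and the numbers in the usage example are hypothetical and are not taken from the sources cited.

```python
def tfp_change(d_output_price, d_input_prices, cost_shares):
    """Output price change minus cost-share-weighted input price changes.

    d_output_price: proportional change in the sector's output price.
    d_input_prices: dict of proportional input price changes, keyed by input.
    cost_shares:    dict of cost shares with the same keys, summing to one.
    """
    if set(d_input_prices) != set(cost_shares):
        raise ValueError("input prices and cost shares must cover the same inputs")
    if abs(sum(cost_shares.values()) - 1.0) > 1e-9:
        raise ValueError("cost shares must sum to one")
    weighted_inputs = sum(cost_shares[k] * d_input_prices[k] for k in cost_shares)
    return d_output_price - weighted_inputs


if __name__ == "__main__":
    # Purely illustrative numbers, not estimates from the paper.
    shares = {"materials": 0.5, "labor": 0.3, "capital": 0.2}
    price_changes = {"materials": 0.02, "labor": 0.01, "capital": 0.0}
    print(tfp_change(-0.01, price_changes, shares))
```

Requiring the shares to sum to one is a design choice of this sketch; it guards against a dropped input silently inflating the measure.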
The remaining export sectors, which are less important in terms of weight,
are woolens and worsteds, other manufactures, and agricultural goods. Because
the previous relationship held fairly close and because of a lack of data,
I assumed that McCloskey’s estimate of total factor productivity growth in
woolens and worsteds held for the 1820-46 period. Productivity growth rates for
other manufactures and for agriculture are in Williamson (1985, p. 90), and
for this small group I constructed an annual series that assumed a constant
growth rate.

Once total factor productivity estimates were derived, they were averaged 
over 10-year periods (approximately) and weighted by their share in exports. 
The results were not sensitive to the time frame chosen. Export shares are 
given in Davis (1979, p. 15). 

B. British Imports 

Import volumes and prices are in Imlah (1958, pp. 94-95). British industrial 
production is the Hoffman index, reprinted in Mitchell (1962, p. 271). The 
only other new statistic is an index of total factor productivity in the export 
sector of Britain’s suppliers. Most U.S. exports consisted of raw cotton and
other agricultural exports. An estimate of total factor productivity in this
sector is given in Gallman (1972). Nonagricultural exports from the United
States are assumed to have a total factor productivity growth the same as
David’s (1977) estimate of U.S. total factor productivity for this period as a 
whole. France’s total factor productivity growth poses a problem. On the basis 
of evidence in Levy-Leboyer (1978) and Auffret et al. (1981), a total factor 
productivity growth in exports of 0.5 percent annually was assumed. Chang¬ 
ing this assumption to 0.75 alters the results of table 1 negligibly. 
Altruism and Time Consistency: The
Economics of Fait Accompli


Assar Lindbeck and Jörgen W. Weibull

Institute for International Economic Studies 


This paper analyzes the strategic and intertemporal interaction be¬ 
tween two economic agents who have “overlapping” concerns, such 
as altruistic concerns for each other’s welfare. The agents may be two 
individuals, a social bureau and a client, or two units in an organiza¬ 
tion. We show how the presence of such common concerns may lead 
to socially inefficient outcomes, in which one economic agent "free- 
rides” on the other’s concern. We also briefly discuss how this 
inefficiency and free-riding, in the context of interaction between 
individuals, might be mitigated by compulsory social security sys¬ 
tems. As another example we interpret the inefficiency in terms of 
Kornai’s “soft budget constraints" within organizations. 


I. Introduction 

The presence of altruism, in the sense of concern for others’ welfare, 
may easily lead to socially inefficient outcomes in an intertemporal 
setting with strategic behavior, despite the fact that both the donor 
and the recipient are rational and well informed about each other’s 
preferences and endowments, and all choices are voluntary . 1 The 
source of the inefficiency is the recipient’s strategic incentive to 


An earlier version of this paper was presented at the Institute for International 
Economic Studies, December 1986, and at the conference of the European Economic 
Association in Copenhagen, August 1987 (Lindbeck and Weibull 1987). We want to 
thank Wolfgang Leininger, Torsten Persson, Lars E. O. Svensson, and an anonymous 
referee for helpful comments. The research was supported by the Bank of Sweden 
Tercentenary Foundation. 

1 “True” altruism, which is the topic of this paper, should be distinguished from
“cooperative egoism,” i.e., help to others in the expectation of being helped back in the
future (see, e.g., Hammond 1975; Kurz 1978; Wintrobe 1981; Axelrod 1984).
“squander” in an early period in order to subsequently receive more
resources from the other agent, that is, to “free-ride” on his concern. 
A “threat" by a potential donor not to give additional support to an 
agent because he squanders is not credible if the recipient knows that, 
ex post, it will be in the donor’s (altruistic) interest to give such addi¬ 
tional support. A rational donor anticipates this possibility of “fait 
accompli” and may hence reserve some additional resources for this 
purpose. 

The sense in which this outcome is inefficient is that both individ¬ 
uals' welfare could be improved if at the outset the donor could make 
a binding commitment to the given support. The reason why the 
recipient would gain by such an arrangement is that, although his 
total resources would not be enhanced by the commitment, he could 
then allocate them more efficiently over time. By doing so, he in¬ 
creases not only his own welfare but also that of the donor, via the 
donor’s “true” altruism. In other words, there is a problem of time 
consistency: intertemporal equilibrium may be in conflict with 
(Pareto) optimality . 2 The same problem may arise also for types of 
common concern other than altruism. 

The general, multiperiod and multiperson problem of strategic
interaction between altruistic economic agents is quite complex. It is
possible, however, to highlight the basic incentive mechanisms al¬ 
ready in a model with two periods and two agents, who may give 
transfers in the second period only. We therefore restrict the analysis 
to this simple case. The two agents may be individuals who have more 
or less “altruistic" preferences toward each other. Alternatively, they 
may be agents who have some other type of common concern, as in 
the case of a social bureau that cares about the well-being of a client or 
two units in an organization with overlapping objectives. For the sake 
of definiteness, however, we will develop the model mainly in terms of 
altruism between two individuals. 

The main result in our study is that an intertemporal (subgame 
perfect) equilibrium is (Pareto) inefficient whenever a transfer is 
given and the recipient does some saving for himself. Moreover, the 
recipient then is a “free-rider” in the sense that his strategic behavior 
induces the donor to give larger support than he would have given if 
he had had the possibility to precommit himself to a support of his 
own choice. 

An informal discussion of this type of incentive problem, in the 

2 The concept of time consistency, as currently used in macroeconomics, can be
interpreted in (at least) three different ways, depending on the stringency of the
underlying equilibrium concept. While the pioneering study by Kydland and Prescott (1977)
requires only Nash equilibrium, later studies usually require (at least) subgame perfect
equilibrium (see McTaggart and Salant 1986).
context of one altruistic and one selfish individual, has been pursued
earlier by Buchanan (1975b). 3 However, in contrast to his verbal discussion,
the present paper provides a formal, game-theoretic analysis.
In fact, this analysis shows that inefficiency can arise even when both 
individuals are equally altruistic toward each other, a case not consid¬ 
ered by Buchanan. 

Our result is related to, but different from, Becker’s (1974, 1976) 
“rotten kid theorem.” Becker assumes that one agent (the “father”) is 
altruistic, while another (the “rotten kid”) is purely selfish, and that 
the former can transfer resources to the latter after the rotten kid has 
taken his actions, which affect family income. In this asymmetric set¬ 
ting, Becker argues that the rotten kid, despite his selfishness, will act 
in the interest of the whole family. 4 The present analysis, by contrast, 
establishes inefficiency in an intertemporally symmetric setting with or 
without altruistic symmetry between the agents. Since our analysis 
does not require the kid to be rotten, only rational, one may baptize it 
the “smart kid theorem." 

The paper is organized as follows. The basic assumptions of the 
model are given in Section II. Section III is restricted to the special 
case of Cobb-Douglas preferences, while Section IV provides a more 
rigorous and general game-theoretic setup. Questions of efficiency 
and free-riding are analyzed in Section V, while Section VI draws 
some general conclusions from the analysis and suggests some reme¬ 
dies to the inefficiency and free-rider problem under consideration. 
(Mathematical derivations and some extensions of the model are 
given in Lindbeck and Weibull [1987].) 


II. The Model 

Suppose that there are two individuals, 1 and 2, who both live in
periods $t = 1$ and $t = 2$ and whose (total) welfare, $U_1$ and $U_2$, can be
written in the following separable but nested form:

$$U_i = u_i(c_i) + \alpha_i U_j \quad \text{for } i = 1, 2,\ j \neq i. \tag{1}$$

Here $c_i = (c_{i1}, c_{i2})$, where $c_{it}$ is the consumption of individual $i$ in
period $t$. The “individualistic utility” from consumption, $u_i(c_i)$, is assumed
to be additively separable,

$$u_i(c_i) = u_{i1}(c_{i1}) + u_{i2}(c_{i2}), \quad i = 1, 2, \tag{2}$$

where the four functions $u_{it}$ have the usual mathematical properties

3 Varian (1982) has discussed a simple numerical example along the same lines as
Buchanan.

4 See Becker (1974, p. 1076). For further discussion, analysis, and criticism of his
theorem, see Becker (1976), Hirshleifer (1977), Wintrobe (1981), and Bergstrom (1987a).
of utility functions. 5 Each parameter $\alpha_i \geq 0$ represents the altruistic
concern of individual $i$ for the welfare of the other.

Hence, individual $i$ cares not only about $j$’s individualistic utility
from consumption ($u_j$) but about the latter’s total welfare ($U_j$), which
also comprises $j$'s altruistic concern for $i$. Thus there is an infinite
sequence of mutual concerns. However, if $\alpha_1 \alpha_2 \neq 1$, then one may
solve equation (1) for $U_1$ and $U_2$, hence obtaining $(1 - \alpha_1 \alpha_2) U_i =
u_i(c_i) + \alpha_i u_j(c_j)$. Thus equation (1) then in fact defines $U_1$ and $U_2$ as
real-valued functions of the consumption vector $c = (c_1, c_2)$. To avoid
“bizarre” behavior, we assume that

$$\alpha_1 \alpha_2 < 1. \tag{3}$$ 6 7

Consequently, in the present model, the behavior of individuals who
care about the total welfare of others, including also others’ altruistic
concerns, is, under condition (3), identical to the behavior of individuals
who are concerned only about other citizens’ individualistic utility.
Since the positive and constant factor $(1 - \alpha_1 \alpha_2)$ is behaviorally
irrelevant, we will subsequently represent the individual’s preferences
as utility functions $U_i$ defined over consumption allocations:

$$U_i(c) = u_i(c_i) + \alpha_i u_j(c_j), \tag{4}$$

where $c = (c_{11}, c_{12}, c_{21}, c_{22})$.
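The step from the nested form (1) to the reduced form (4) is a two-equation linear solve, and it can be checked numerically. The following is a minimal sketch; the function name and the numerical values are illustrative assumptions, not part of the paper.

```python
def total_welfare(u1, u2, alpha1, alpha2):
    """Solve U1 = u1 + alpha1*U2 and U2 = u2 + alpha2*U1 for (U1, U2).

    u1, u2 are the individualistic utility levels; alpha1, alpha2 are the
    altruism parameters. The system has a unique solution only if
    alpha1 * alpha2 != 1.
    """
    det = 1.0 - alpha1 * alpha2
    if det == 0.0:
        raise ValueError("alpha1 * alpha2 must differ from 1")
    big_u1 = (u1 + alpha1 * u2) / det
    big_u2 = (u2 + alpha2 * u1) / det
    return big_u1, big_u2


if __name__ == "__main__":
    u1, u2, alpha1, alpha2 = 1.0, 2.0, 0.5, 0.25  # illustrative values only
    big_u1, big_u2 = total_welfare(u1, u2, alpha1, alpha2)
    # Both nested equations in (1) should hold at the solution.
    print(abs(big_u1 - (u1 + alpha1 * big_u2)) < 1e-12)
    print(abs(big_u2 - (u2 + alpha2 * big_u1)) < 1e-12)
```

When `alpha1 * alpha2 < 1`, the divisor is positive, so ranking outcomes by `U_i` is the same as ranking them by `u_i + alpha_i * u_j`, which is the content of equation (4).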

Note that altruism, as represented in this equation, is a special case 
of common concerns, in the sense that both individuals have prefer¬ 
ences over the same vectors (in this case vectors c), and each of them 
associates positive marginal utilities with every component. It is this 
“overlap” of concern that is the basic reason for inefficiency. 

As for the resource constraints of the two individuals, let $\omega_i > 0$ be
the initial wealth holding, or endowment, of individual $i$ at the beginning
of the first period. Abstracting from interest earnings, we take
the resources available at the beginning of the second period, $a_i$, to be
the amount saved:

$$a_i = \omega_i - c_{i1}. \tag{5}$$

The strategic interaction between the two individuals is supposed to 
take place as follows. At the first stage both individuals independently 
choose their levels of first-period consumption and plan their second- 

5 More precisely, each function $u_{it}: R_+ \to R$ is continuous and twice continuously
differentiable for positive arguments, with $u_{it}' > 0$, $u_{it}'' < 0$, and
$u_{it}'(c_{it}) \to \infty$ as $c_{it} \to 0$.

6 Drazen (1978, pp. 511-12) analyzes altruistic gifts between generations and uses a
similar constraint on the degree of mutual altruism. If instead $\alpha_1 \alpha_2 > 1$, then each
individual $i$ strives to minimize $u_i + \alpha_i u_j$. Bergstrom (1987b) analyzes cardinal
representations of a more general class of intertwined altruistic preferences, including this
bizarre case.

7 For an extension of the analysis to the case in which the first-period income is
endogenous and depends on labor supply, see Lindbeck and Weibull (1987).
period consumption and transfer, with each other’s preferences and
initial endowments as common knowledge. At the second stage, they
observe the resulting “state” vector $a = (a_1, a_2)$, and each individual $i$
now decides how much, $t_i$, of his current resources ($a_i$) to give to the
other (where $0 \leq t_i \leq a_i$). The individuals also make this second
decision independently of each other. Consumption in the second
period is then

$$c_{i2} = a_i + t_j - t_i, \quad j \neq i. \tag{6}$$

III. Cobb-Douglas Preferences

Before we go to a more general and rigorous game-theoretic treatment,
let us examine the special case in which the two individuals have
identical Cobb-Douglas preferences for consumption:
$u_i(c_i) = \log(c_{i1}) + \log(c_{i2})$.

Looking at the second stage first, we note that the pattern of transfers
between the two individuals depends on their assets, $a_1$ and $a_2$, at
the beginning of that period. Before any transfer is made, each individual
$i$ derives marginal utility $1/a_i$ from his own assets and $\alpha_i/a_j$ from
those of the other. Hence, if $\alpha_1 a_1 > a_2$, then individual 1 can increase
his (total) welfare by giving a transfer to individual 2, and vice versa if
$\alpha_2 a_2 > a_1$. The “state space” $A$ can accordingly be partitioned into
three regions, $A_1$, $A_2$, and $A_0$, where $A_0$ is the residual set (see fig. 1).


Fig. 1.—The subsets $A_0$, $A_1$, and $A_2$ in the Cobb-Douglas case
If $a = (a_1, a_2)$ lies in $A_i$ ($i \neq 0$), then it turns out to be optimal for
individual $i$ to make such a transfer to $j$ that their joint resources, $a_1 + a_2$,
are split into shares $1/(1 + \alpha_i)$ and $\alpha_i/(1 + \alpha_i)$, the first for himself
and the second for the other. 8 Moreover, such an optimal transfer can
be shown to constitute the unique Nash equilibrium in the second-period
interaction, for any given state $a$ in $A$ (including zero transfers
in $A_0$). Hence, if the two individuals save $a_1$ and $a_2$, respectively, for
the second period and the corresponding equilibrium transfers are
given in the second period, then the utility of individual $i$ is

$$\pi_i(a_1, a_2) = \log(\omega_i - a_i) + \alpha_i \log(\omega_j - a_j) + \begin{cases} \log(a_i) + \alpha_i \log(a_j) & \text{for } a \in A_0, \\ \log\dfrac{a_1 + a_2}{1 + \alpha_i} + \alpha_i \log\dfrac{\alpha_i (a_1 + a_2)}{1 + \alpha_i} & \text{for } a \in A_i, \\ \log\dfrac{\alpha_j (a_1 + a_2)}{1 + \alpha_j} + \alpha_i \log\dfrac{a_1 + a_2}{1 + \alpha_j} & \text{for } a \in A_j, \end{cases} \tag{7}$$

for $i = 1, 2$ and $j \neq i$; compare equations (5) and (6).
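The second-stage transfer rule for the Cobb-Douglas case can be written out as a short function. This is a sketch under the assumptions stated in the text (logarithmic utility, altruism parameters with product below one); the function and variable names are illustrative, not the authors' notation.

```python
def second_stage_transfers(a1, a2, alpha1, alpha2):
    """Equilibrium transfers (t1, t2) in the second period, Cobb-Douglas case.

    a1, a2: assets carried into period 2; alpha1, alpha2: altruism parameters
    with alpha1 * alpha2 < 1, so at most one individual wants to give.
    """
    if alpha1 * alpha2 >= 1:
        raise ValueError("requires alpha1 * alpha2 < 1")
    total = a1 + a2
    if alpha1 * a1 > a2:
        # Region A_1: individual 1 gives and keeps the share 1/(1 + alpha1).
        return a1 - total / (1.0 + alpha1), 0.0
    if alpha2 * a2 > a1:
        # Region A_2: individual 2 gives and keeps the share 1/(1 + alpha2).
        return 0.0, a2 - total / (1.0 + alpha2)
    # Region A_0: no transfers.
    return 0.0, 0.0


def second_period_consumption(a1, a2, alpha1, alpha2):
    """Second-period consumption (c12, c22) implied by equation (6)."""
    t1, t2 = second_stage_transfers(a1, a2, alpha1, alpha2)
    return a1 + t2 - t1, a2 + t1 - t2


if __name__ == "__main__":
    # Illustrative state: individual 1 is altruistic, individual 2 is selfish.
    print(second_stage_transfers(1.0, 0.2, 0.5, 0.0))
    print(second_period_consumption(1.0, 0.2, 0.5, 0.0))
```

In the illustrative state, the recipient ends up with the share `alpha1 / (1 + alpha1)` of joint resources, as the text describes.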

If, in period 1, the two (rational) individuals anticipate the corresponding
equilibrium transfers in the second period, then each player
$i$ should choose an $a_i$ that is optimal against the other’s choice. Hence
such Nash equilibria $(a_1, a_2)$ can be obtained as the intersections between
the two individuals’ “best-reply” correspondences. In fact, it
can be shown that, in the present Cobb-Douglas setting, these correspondences
always intersect at least once. 9 Hence existence of (subgame
perfect) equilibrium is guaranteed here.


A. Equal Endowments 

To examine the nature of such equilibria, we will first consider the 
special case of one altruistic and one selfish individual with equal 
endowments: now assume $\omega_1 = \omega_2 = 1$ and $\alpha_1 > 0 = \alpha_2$. Then the set
$A_2$ is clearly empty: individual 2 never wants to give a transfer since
he is strictly selfish. It is straightforward to show that then the best- 
reply correspondence of the altruistic individual is the graph of a 
continuous, nonincreasing, and piecewise linear function as in figure 
2a. Since in the present Cobb-Douglas case his utility function for 
consumption is the same in each period and future consumption by 


8 This follows directly from first-order conditions, just as in intertemporal Cobb-Douglas
allocation of resources $a_1 + a_2$ over two time periods, with a discount factor $\alpha_i$.

9 It turns out that each correspondence is the closure of the graph of a nonincreasing 
function that has at most one discontinuity. 
assumption is not discounted, it is evidently optimal for him to consume
half his endowment in each period when no transfers are given.
This is the explanation behind the vertical part (in the region $A_0$) of
his best-reply correspondence. Moreover, this individual knows that
he will want to support individual 2 if the latter has little left in period
2, so in such states he saves more than what would be optimal from a
purely selfish viewpoint, and more so the less individual 2 has in the
second period. This is the rationale behind the sloping part of his
best-reply correspondence (in $A_1$).

The best-reply correspondence of the potential recipient is somewhat
more complex. It turns out that the value of this correspondence
is a one-point set for all $a_1 \neq \bar{a}_1$, where

$$\bar{a}_1 = \left(1 + \frac{1}{\alpha_1}\right)^{1/2} - 1. \tag{8}$$

At the critical point $a_1 = \bar{a}_1$, each of two distinct values of $a_2$ is an
optimal reply: individual 2 can then maximize his welfare either by
saving half his endowment for period 2, $a_2 = 1/2$, or by saving only
$a_2 = (1 - \bar{a}_1)/2$. If $a_1 < \bar{a}_1$, then it is best for him to save “prudently”
($a_2 = 1/2$), while if $a_1 > \bar{a}_1$, then a certain amount of “undersaving” is
optimal, $a_2 = (1 - a_1)/2$. In other words, the more resources individual
2 expects 1 to save beyond 1/2, the less 2 should save, and the shift
from the prudent saving regime to the undersaving regime takes the
form of a “jump,” as in figure 2b.

Note also that equation (8) confirms the intuition that the more
altruistic individual 1 is, the wider is the range of values of $a_1$ at which
undersaving is optimal. In particular, $\bar{a}_1 > 1$ for all $\alpha_1$ below 1/3; that
is, when the altruism of the potential donor is sufficiently small, then
the potential recipient does not have an incentive to undersave, irrespective
of his expectation concerning the altruistic individual’s saving.
The reason is, of course, that the anticipated transfer will then be
too low to make any level of undersaving advantageous. This actually
proves that $a^* = (1/2, 1/2)$ is the unique Nash equilibrium in $G$, for
$\alpha_1 < 1/3$ (see fig. 3a).

Conversely, $\bar{a}_1 < 1/2$ for all $\alpha_1$ above 4/5; then the potential recipient
has an incentive to undersave whenever the potential donor saves
at least half his endowment for the second period, which he always
does. Hence, there exists a unique Nash equilibrium in $G$ when
$\alpha_1 > 4/5$, and a transfer is then given (see fig. 3b). In this equilibrium, the
altruistic individual’s expectation about the other’s undersaving is
fulfilled, as is the selfish individual's expectation about the support to
be received. For some intermediary values of $\alpha_1$ there are two equilibria:
one prudent saving and one undersaving equilibrium.
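The critical savings level and the recipient's best reply in the equal-endowment case can be illustrated numerically. The sketch below assumes unit endowments, logarithmic utility, and a selfish recipient, as in the text. The threshold formula is the one that makes the recipient indifferent between the two saving regimes, and it reproduces the cutoffs 1/3 and 4/5 quoted above. All function names are illustrative.

```python
import math


def savings_threshold(alpha1):
    """Donor savings level at which the recipient is indifferent between regimes."""
    if alpha1 <= 0:
        raise ValueError("alpha1 must be positive")
    return math.sqrt(1.0 + 1.0 / alpha1) - 1.0


def recipient_utility(a1, a2, alpha1):
    """Selfish recipient's utility with unit endowment, given both savings levels."""
    if alpha1 * a1 > a2:
        # A transfer is given: the recipient consumes alpha1/(1+alpha1) of joint assets.
        c22 = alpha1 * (a1 + a2) / (1.0 + alpha1)
    else:
        c22 = a2  # no transfer
    return math.log(1.0 - a2) + math.log(c22)


def best_reply_recipient(a1, alpha1):
    """Prudent saving below the threshold, undersaving at or above it.

    At the threshold itself both replies are optimal; this sketch returns
    the undersaving one.
    """
    if a1 < savings_threshold(alpha1):
        return 0.5
    return (1.0 - a1) / 2.0


if __name__ == "__main__":
    for alpha1 in (1.0 / 3.0, 0.5, 0.8):
        print(alpha1, savings_threshold(alpha1))
```

Comparing `recipient_utility` at the two candidate replies when `a1` equals the threshold gives a direct check of the indifference property.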
B. Equal Altruism

Let us turn to the polar case of equal altruism but differing endowments:
$\alpha_1 = \alpha_2 = \alpha > 0$ and $\omega_1 > \omega_2 > 0$. It turns out that if

$$\alpha \omega_1 > \omega_2, \tag{9}$$

that is, if the difference in wealth is sufficiently large and the degree 
of altruism is sufficiently high, then a transfer is given in equilibrium, 
from the wealthier to the poorer individual. Moreover, for some com¬ 
binations of parameters, the poorer individual will save some of his 
initial endowment, while for other combinations he will not save at all. 

To see this, consider figure 4, which shows 1’s and 2’s best-reply
correspondences (drawn for the special case $\alpha \omega_2 < \omega_1/2 < \omega_2/\alpha$). In
contrast to the case of only one altruistic individual, there now appears
a region $A_2$ in the state space $A$. It can be shown that, when (9)
holds, then individual 2’s switch from a prudent saving regime to an
undersaving regime occurs at a value of $a_1$ that is lower than the
individualistically optimal savings level, $\omega_1/2$, of individual 1. Thus
the two correspondences intersect only in $A_1$. If, in addition to (9), we
have $(2 + \alpha)\omega_2 > \omega_1$, that is, the difference in wealth is not too large,
then there is a unique intersection, at a positive value of $a_2$.

Suppose, in contrast, that the difference in wealth is so large that
$(2 + \alpha)\omega_2 \leq \omega_1$ but (9) still holds. As is seen in figure 4, the two
correspondences then instead have a unique intersection on the $a_1$-axis. In
such an equilibrium, the wealthy individual’s saving is as large as it can 
be in any equilibrium, and the less wealthy individual consumes all his 
initial endowment already in the first period, correctly anticipating 
the transfer he will receive in period 2. 
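The case distinction above can be sketched numerically. The following is a minimal illustration that assumes the log-utility (Cobb-Douglas) form $u_i = \ln c_{i1} + \ln c_{i2}$, which is not restated in this section; the function name and the closed forms are derived here under that assumption and are not quoted from the text. The sketch solves only the first-order conditions that apply when individual 1 is the donor; it does not perform the global best-reply comparison shown in figure 4.

```python
def equal_altruism_equilibrium(alpha, w1, w2):
    """Savings (a1, a2) and transfer t from 1 to 2 under assumed log utility.

    Hypothetical helper: requires 0 < alpha < 1, w1 > w2 > 0 and
    condition (9), alpha * w1 > w2.
    """
    if not (0 < alpha < 1 and w1 > w2 > 0 and alpha * w1 > w2):
        raise ValueError("need 0 < alpha < 1, w1 > w2 > 0 and alpha*w1 > w2")
    if (2 + alpha) * w2 > w1:
        # Moderate wealth gap: both first-order conditions hold with equality,
        #   a1 + a2 = (1 + alpha) * (w1 - a1)   (donor)
        #   a1 + a2 = (1 + alpha) * (w2 - a2)   (recipient)
        a1 = ((2 + alpha) * w1 - w2) / (3 + alpha)
        a2 = ((2 + alpha) * w2 - w1) / (3 + alpha)
    else:
        # Large wealth gap: the recipient saves nothing.
        a1 = (1 + alpha) * w1 / (2 + alpha)
        a2 = 0.0
    # Second-period transfer: solves 1/(a1 - t) = alpha/(a2 + t).
    t = (alpha * a1 - a2) / (1 + alpha)
    return a1, a2, t


if __name__ == "__main__":
    print(equal_altruism_equilibrium(0.5, 9.0, 4.0))   # recipient saves a little
    print(equal_altruism_equilibrium(0.5, 20.0, 4.0))  # recipient saves nothing
```

With $\alpha = 0.5$ and $\omega_2 = 4$, the threshold $(2 + \alpha)\omega_2 = 10$ separates the two cases, so $\omega_1 = 9$ falls in the first and $\omega_1 = 20$ in the second.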


IV. A More General and Rigorous
Game-Theoretic Approach 

In the two-stage game outlined in Section II, which we will denote $G$, a (pure) strategy for individual $i$ is a pair $s_i = (a_i, \tau_i)$, where $a_i$ is his savings for the second period and $\tau_i$ is a function that specifies the transfer $t_i$ to be given in every possible state $a = (a_1, a_2)$: $t_i = \tau_i(a)$.$^{10}$ The payoffs associated with any pair $(s_1, s_2)$ of strategies for the two agents, $\pi_1(s_1, s_2)$ and $\pi_2(s_1, s_2)$, are simply the corresponding (total) welfare levels defined in equation (4):

$$\pi_i(s_1, s_2) = u_i[\omega_i - a_i,\ a_i - \tau_i(a) + \tau_j(a)] + \alpha_i u_j[\omega_j - a_j,\ a_j + \tau_i(a) - \tau_j(a)], \quad i = 1, 2,\ j \ne i. \tag{10}$$

10 Formally, let $A = [0, \omega_1] \times [0, \omega_2]$, and define the strategy set of individual $i$ as $S_i = \{(a_i, \tau_i) : a_i \in [0, \omega_i],\ \tau_i : A \to R_+,\ \text{and } \tau_i(a) \le a_i\ \forall a \in A\}$.




Fig. 4. The “best-reply” correspondences of individuals 1 (in a) and 2 (in b) in the case of equal altruism.





JOURNAL OF POLITICAL ECONOMY 



The second stage of the game $G$, played after the state $a$ has been revealed, constitutes a subgame, $G(a)$, in which a (pure) strategy of individual $i$ simply is his transfer $t_i$. Hence, his pure strategy set in $G(a)$ is the interval $[0, a_i]$, and his payoff function is

$$\pi_i(t_1, t_2 \mid a) = u_{i2}(a_i - t_i + t_j) + \alpha_i u_{j2}(a_j + t_i - t_j) + u_{i1}(\omega_i - a_i) + \alpha_i u_{j1}(\omega_j - a_j), \quad i = 1, 2,\ j \ne i. \tag{11}$$

Note that the last two terms are exogenous to the subgame and hence are strategically irrelevant in the subgame.

If a Nash equilibrium $((a_1, \tau_1), (a_2, \tau_2))$ in the full two-stage game $G$ satisfies the further requirement that, for every state $a'$, the point $(\tau_1(a'), \tau_2(a'))$ be a Nash equilibrium in the corresponding subgame $G(a')$, then $((a_1, \tau_1), (a_2, \tau_2))$ is a subgame perfect equilibrium of $G$. This is the intertemporal equilibrium concept that we used in the Cobb-Douglas example above and that we will subsequently use because it reflects the real-life difficulty of making commitments to future actions that will then not be optimal.

To get some handle on the set of subgame perfect equilibria, let us first examine the set of (pure) Nash equilibria in a subgame $G(a)$, where $a$ is a fixed but arbitrary point in $A = [0, \omega_1] \times [0, \omega_2]$. The two (disjoint) sets $A_1$ and $A_2$ are here defined by

$$A_i = \{a \in A : a \ne (0, 0) \text{ and } u'_{i2}(a_i) < \alpha_i u'_{j2}(a_j)\}, \quad i = 1, 2,\ j \ne i.$$

Let $A_0$ be the subset of points in $A$ that belong to neither $A_1$ nor $A_2$.

Along with this partitioning of the state space $A$, a corresponding pair of functions, $\tau_1^*$ and $\tau_2^*$, will be useful. Each such function $\tau_i^*$ defines, for every state $a$ in $A$, the transfer that individual $i$ would like to give to $j$, granted that $j$ gives nothing to $i$. Formally, $\tau_i^*(a)$ is defined to be zero for all $a$ outside $A_i$ and, for $a$ in $A_i$, as the (unique) solution $t_i$ of

$$u'_{i2}(a_i - t_i) = \alpha_i u'_{j2}(a_j + t_i). \tag{12}$$

With the aid of these two “transfer functions” one can prove that 
there exists a unique Nash equilibrium in every subgame G(a) and 
that at most one of the individuals gives a transfer in equilibrium. 

Lemma. For every $a \in A$, $(\tau_1^*(a), \tau_2^*(a))$ is the unique Nash equilibrium of $G(a)$.
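The lemma can be illustrated with a small numerical sketch. It assumes logarithmic second-period utility, $u_{i2} = u_{j2} = \ln$ (an assumption made here for concreteness), so that the condition defining $A_i$ reduces to $a_j < \alpha_i a_i$ and equation (12) has a closed-form solution. The function names are hypothetical. The grid search is only a crude check that neither individual gains from a unilateral deviation; it is not a proof of uniqueness.

```python
import math


def tau_star(alpha_i, a_i, a_j):
    """Transfer i would give j if j gives nothing (assumed log utility)."""
    if alpha_i * a_i <= a_j:
        return 0.0  # state lies outside A_i
    # Solves 1/(a_i - t) = alpha_i/(a_j + t), i.e. equation (12) with u = ln.
    return (alpha_i * a_i - a_j) / (1.0 + alpha_i)


def subgame_payoff(alpha_i, a_i, a_j, t_i, t_j):
    """Strategically relevant part of (11): u_i2(c_i2) + alpha_i * u_j2(c_j2)."""
    c_i = a_i - t_i + t_j
    c_j = a_j + t_i - t_j
    if c_i <= 0 or (alpha_i > 0 and c_j <= 0):
        return -math.inf
    other = alpha_i * math.log(c_j) if alpha_i > 0 else 0.0
    return math.log(c_i) + other


def is_subgame_nash(alpha1, alpha2, a1, a2, steps=1000, tol=1e-9):
    """Check on a grid that (tau_1*, tau_2*) admits no profitable deviation."""
    t1, t2 = tau_star(alpha1, a1, a2), tau_star(alpha2, a2, a1)
    base1 = subgame_payoff(alpha1, a1, a2, t1, t2)
    base2 = subgame_payoff(alpha2, a2, a1, t2, t1)
    dev1 = max(subgame_payoff(alpha1, a1, a2, a1 * k / steps, t2)
               for k in range(steps + 1))
    dev2 = max(subgame_payoff(alpha2, a2, a1, a2 * k / steps, t1)
               for k in range(steps + 1))
    return base1 >= dev1 - tol and base2 >= dev2 - tol


if __name__ == "__main__":
    print(tau_star(0.5, 5.0, 1.0), tau_star(0.5, 1.0, 5.0))
    print(is_subgame_nash(0.5, 0.5, 5.0, 1.0))
```

In the state $a = (5, 1)$ with $\alpha_1 = \alpha_2 = 0.5$, only individual 1 gives a transfer, in line with the statement that at most one of the individuals gives in equilibrium.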

This result implies that the subgame perfect equilibria of the (infinite-dimensional) two-stage game $G$ are in a one-to-one correspondence with the Nash equilibria of a (two-dimensional) one-shot game. For, by definition, any subgame perfect equilibrium in $G$ induces a Nash equilibrium on every subgame $G(a)$, so if $((a_1, \tau_1), (a_2, \tau_2))$ is such a subgame perfect equilibrium, then, by the lemma, the functions $\tau_i$ must be identical to the functions $\tau_i^*$. Hence, just as in the Cobb-Douglas example above, one can model the full strategic interaction between the two agents in terms of a one-shot game, $\tilde{G}$, in which the strategies simply are the scalar variables $a_1$ and $a_2$, and the payoff functions are defined by substitution of $\tau_i^*$ for $\tau_i$ in the payoff functions of the full two-stage game $G$:


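$$\tilde{\pi}_i(a_1, a_2) = u_{i1}(\omega_i - a_i) + \alpha_i u_{j1}(\omega_j - a_j) + \begin{cases} u_{i2}[a_i - \tau_i^*(a)] + \alpha_i u_{j2}[a_j + \tau_i^*(a)] & \text{for } a \in A_i \\ u_{i2}(a_i) + \alpha_i u_{j2}(a_j) & \text{for } a \in A_0 \\ u_{i2}[a_i + \tau_j^*(a)] + \alpha_i u_{j2}[a_j - \tau_j^*(a)] & \text{for } a \in A_j. \end{cases} \tag{13}$$

(This is a generalization of eq. [7].)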


V. The “Smart Kid Theorem” 

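Now suppose that $a^*$ is a (pure-strategy) Nash equilibrium of the one-shot game $\tilde{G}$ in which individual $i$ makes a transfer to $j$, that is, $a^* \in A_i$. Necessary first-order conditions then are

$$\frac{u'_{i1}(c^*_{i1})}{u'_{i2}(c^*_{i2})} = 1 \tag{14a}$$

and

$$\frac{u'_{j1}(c^*_{j1})}{u'_{j2}(c^*_{j2})} \begin{cases} = 1 + (1 - \alpha_1\alpha_2)\,\dfrac{\partial\tau_i^*}{\partial a_j} & \text{if } a_j^* > 0 \\[2ex] \ge 1 + (1 - \alpha_1\alpha_2)\,\dfrac{\partial\tau_i^*}{\partial a_j} & \text{if } a_j^* = 0. \end{cases} \tag{14b}$$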

As can be expected, $\partial\tau_i^*/\partial a_j < 0$; that is, the transfer is decreasing in the recipient’s second-period resources. By hypothesis $\alpha_1\alpha_2 < 1$, so condition (14b) implies that, whenever $a_j^* > 0$, the recipient’s marginal utility of first-period consumption is lower than his marginal utility of second-period consumption. From a nonstrategic viewpoint, this is clearly not optimal for individual $j$ since, in the absence of interest earnings, the return to first-period savings is one to one. In other words, whenever he does some saving in equilibrium, the strategic motive induces him to undersave, not only in comparison with what would have been optimal in the absence of a transfer, but also as compared with what would have been optimal had the equilibrium transfer been exogenously fixed. The reason for the “distortionary wedge” in (14b) is the (rational) speculation in the first period for more support in the second: individual $j$ knows that the transfer to be received is a decreasing function of his own savings. Hence, such an “interior” equilibrium (i.e., one in which the recipient does some saving himself) is Pareto inefficient. For if the donor were able to commit himself to the equilibrium transfer, then his individualistic utility would be unchanged, whereas the recipient could increase his individualistic utility by consuming less in the first period and more in the second.$^{11}$ This way, both individuals’ total welfare could be increased via the donor’s altruistic concern.
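To make the size of the wedge concrete, the following worked derivation assumes logarithmic second-period utility, $u_{i2} = u_{j2} = \ln$; this functional form is an assumption made here for illustration. Under it, the donor's transfer condition $u'_{i2}(a_i - t_i) = \alpha_i u'_{j2}(a_j + t_i)$ can be solved explicitly and substituted into the recipient's first-order condition for $a_j^* > 0$.

```latex
\begin{align*}
\frac{1}{a_i - t_i} = \frac{\alpha_i}{a_j + t_i}
  \quad &\Longrightarrow \quad
  \tau_i^*(a) = \frac{\alpha_i a_i - a_j}{1 + \alpha_i},
  \qquad
  \frac{\partial \tau_i^*}{\partial a_j} = -\frac{1}{1 + \alpha_i} < 0, \\[1ex]
\frac{u'_{j1}(c^*_{j1})}{u'_{j2}(c^*_{j2})}
  &= 1 + (1 - \alpha_1\alpha_2)\,\frac{\partial \tau_i^*}{\partial a_j}
   = 1 - \frac{1 - \alpha_i\alpha_j}{1 + \alpha_i}
   = \frac{\alpha_i (1 + \alpha_j)}{1 + \alpha_i} < 1 .
\end{align*}
```

The last inequality holds exactly when $\alpha_i\alpha_j < 1$. With equal altruism $\alpha_i = \alpha_j = \alpha$ the ratio reduces to $\alpha$, so in this illustrative case the recipient's undersaving is more pronounced the less altruistic the two individuals are.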

The corner solution $a_j^* = 0$ is slightly more complicated because if $j$ has a sufficiently small initial endowment and/or $i$ is sufficiently altruistic, then it may indeed be in both individuals’ interest for $j$ to consume his whole wealth already in period 1. It turns out that an equilibrium with $a^* \in A_i$ and $a_j^* = 0$ is (Pareto) inefficient if and only if $a_i^* < \gamma_i$, where $\gamma_i = \max\{a_i \in [0, \omega_i] : u'_{j2}[\tau_i^*(a)] \ge u'_{j1}(\omega_j) \text{ when } a_j = 0\}$.$^{12}$ (By inefficiency we here mean a deviation from what would be feasible under commitment, as discussed above.) Not surprisingly, this upper level on $a_i^*$ for inefficiency is decreasing in $\alpha_i$ and increasing in $\omega_j$ (but is functionally independent of $\alpha_j$ and $\omega_i$).

One may also say that the problem of inefficient equilibria in this 
model is an example of free-riding. The reason for borrowing this 
term from the theory of optimum supply of public goods is that 
altruism renders a public-good character to individual consumption 
since (at least) one individual’s consumption then also enters others’
utility functions.$^{13}$ In the public finance literature, an individual is
said to be a free-rider if he does not contribute to the provision of a 
public good on the grounds that others will pay enough to cover its 
financing. But in the present context it is not clear what one should 
mean by an individual’s “contribution” to his own welfare—when this 
is a public good. 

Hence, we need a definition of free-riding for situations of strategic 
interaction between altruistic individuals. In such contexts, it seems 
natural to say that an individual is a free-rider if his strategic behavior 
induces others to contribute more to his welfare than they would like 
to, had they had the possibility to commit themselves to a support of 
their own choice. This intuitive notion can be formalized as follows. 

Suppose that $a^*$ is a Nash equilibrium in the one-shot game $\tilde{G}$, and assume that $a^* \in A_i$ for $i \ne 0$. Then $t^* = \tau_i^*(a^*) > 0$ is the transfer that $i$ (voluntarily) gives to $j$ in this equilibrium. If individual $i$ instead were able to commit himself in advance to any transfer $t$ in $[0, \omega_i]$, while the intertemporal allocation of consumption were left at each individual’s discretion,

11 Note that the commitment is to give exactly $t^*$, neither less nor more.

12 Note that $u'_{j2}[\tau_i^*(a)]$, when $a_j = 0$, is continuous in $a_i > 0$ and tends to plus infinity as $a_i \to 0$.

13 The public-good character of others’ consumption, in the presence of altruism, is discussed in an atemporal setting in Buchanan (1975a).
then the individualistic utilities associated with such a transfer would be

$$v_i(t) = \max_{a_i}\ u_i(\omega_i - a_i,\ a_i - t) \quad \text{subject to } a_i \in [t, \omega_i] \quad \text{(the donor)} \tag{15a}$$

$$v_j(t) = \max_{a_j}\ u_j(\omega_j - a_j,\ a_j + t) \quad \text{subject to } a_j \in [0, \omega_j] \quad \text{(the recipient)}. \tag{15b}$$

The total welfare that $i$ would derive from such a committed transfer $t$ would thus be $V_i(t) = v_i(t) + \alpha_i v_j(t)$. It can be shown that the (indirect utility) function $V_i : [0, \omega_i] \to R$ is continuous and strictly concave, and hence achieves its maximum at a unique level $\hat{t}$. In the spirit of the intuitive definition above, we accordingly say that $j$ is a free-rider in $a^*$ if $\hat{t} < t^*$.

If $a^*$ is “interior” (in the sense $a_j^* > 0$), then $V_i'(t^*) < 0$ by the envelope theorem (and conditions [14]). Thus $\hat{t} < t^*$, by concavity of $V_i$, so the recipient indeed is a free-rider by the definition above. If instead $a_j^* = 0$, then one can likewise show that $V_i'(t^*) < 0$ if $a_i^* < \gamma_i$ and $V_i'(t^*) = 0$ if $a_i^* \ge \gamma_i$. Hence, $j$ is then a free-rider if and only if $a_i^* < \gamma_i$. In sum, an equilibrium $a^* \in A_i$, for $i \ne 0$, is socially inefficient if and only if $j$ is a free-rider.
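The free-rider test can be sketched numerically. The code below assumes the log-utility form $u(c_1, c_2) = \ln c_1 + \ln c_2$ for both individuals (an assumption made here, not restated in this section), under which the inner maximizations in (15a) and (15b) have closed forms. All function names are hypothetical, and the equilibrium transfer $t^*$ is taken as an input rather than computed. The maximizer $\hat{t}$ is located by ternary search, which relies on the strict concavity of $V_i$ and is accurate only to roughly $10^{-6}$, hence the loose default tolerance.

```python
import math


def v_donor(t, w_i):
    """(15a) under assumed log utility: optimal a_i = (w_i + t) / 2."""
    return 2.0 * math.log((w_i - t) / 2.0)


def v_recipient(t, w_j):
    """(15b) under assumed log utility, with the constraint a_j >= 0."""
    a_j = max(0.0, (w_j - t) / 2.0)
    return math.log(w_j - a_j) + math.log(a_j + t)


def total_welfare(t, alpha_i, w_i, w_j):
    """V_i(t) = v_i(t) + alpha_i * v_j(t)."""
    return v_donor(t, w_i) + alpha_i * v_recipient(t, w_j)


def t_hat(alpha_i, w_i, w_j, tol=1e-7):
    """Maximizer of V_i on (0, w_i), found by ternary search."""
    lo, hi = 0.0, w_i
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if total_welfare(m1, alpha_i, w_i, w_j) < total_welfare(m2, alpha_i, w_i, w_j):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def is_free_rider(t_star, alpha_i, w_i, w_j, eps=1e-4):
    """True if the donor's preferred committed transfer is below t_star."""
    return t_hat(alpha_i, w_i, w_j) < t_star - eps


if __name__ == "__main__":
    # t_star values below are illustrative equilibrium transfers for
    # alpha = 0.5, w_j = 4 under the same log-utility assumption.
    print(t_hat(0.5, 9.0, 4.0), is_free_rider(11.0 / 7.0, 0.5, 9.0, 4.0))
    print(t_hat(0.5, 30.0, 4.0), is_free_rider(6.0, 0.5, 30.0, 4.0))
```

In the first call the donor would prefer to commit to a much smaller transfer than he gives in equilibrium, so the recipient is a free-rider; in the second the two coincide and the equilibrium is efficient in the sense defined above.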

Applying these conditions for inefficiency and free-riding to the example in Section IIIA of an altruistic and a selfish individual with equal endowments, we immediately find that when the altruism of the first individual is large ($\alpha_1 > 4/5$), the (unique) equilibrium is inefficient and the selfish individual is a free-rider (since then $a^* \in A_1$ and $a_2^* > 0$).

In the example of equally altruistic but differently endowed individuals in Section IIIB, it was found that a transfer is given from the wealthy to the poor, in equilibrium, if the endowment of the less wealthy individual is less than the share $\alpha$ of the endowment of the wealthier individual, where $\alpha$ is the (common) degree of altruism (condition [9]). This is the area below the diagonal in figure 5. For certain combinations of endowments and altruism (in this area) involving a relatively small difference in endowments, the poor individual will do some saving in equilibrium, which results in inefficiency and free-riding. This is area $I_1$ in the figure. For other combinations, he will save nothing, in which case we have to compare $a_1^*$ with $\gamma_1$ in order to know whether he is a free-rider or not. In this example, $a_1^* = (1 + \alpha)\omega_1/(2 + \alpha)$ and $\gamma_1 = \omega_2(1 + \alpha)/\alpha$, so the equilibrium is inefficient if $\alpha\omega_1 < (2 + \alpha)\omega_2$ (see area $I_2$ in the figure), while it is efficient if $\alpha\omega_1 \ge (2 + \alpha)\omega_2$ (this is area $E$). In the latter area, it is in both individuals’ interest for the poor individual to consume all his wealth in the first period.
Fig. 5. Efficient ($E$) and inefficient ($I_1$ and $I_2$) equilibria in the Cobb-Douglas example of equally altruistic ($\alpha$) but differently endowed individuals ($\omega_1 > \omega_2$).


VI. Final Remarks

The binding agreements that are necessary to avoid inefficiency and free-riding are difficult to enforce in practice. Then what are the realistic possibilities to reduce or, ideally, eliminate the “strategic distortions” analyzed above?

In the case of life cycle savings, the analysis above suggests that compulsory savings systems could be welfare improving even when actuarially fair. This opens up the possibility for an explanation, or at least a rationalization, of social security systems along the lines of free-riding, as a complement to the “traditional” explanations based on market failure (for instance due to adverse selection or moral hazard), paternalistic concern for myopic citizens, or ambitions to redistribute income within and between generations.$^{14}$ However, there are some limitations to this free-riding argument for social security.

First, for an actuarially fair compulsory savings system to be effective at all, it is evidently necessary that claims to future pensions cannot be fully used as collateral in the credit market. Second, even in the more realistic case in which claims to pensions cannot be effectively used as collateral for loans in the first, active, period of life, a compulsory savings system has to be binding for an individual who would otherwise be a free-rider. Moreover, the net welfare effect of such a compulsory savings system will be positive in equilibrium only if the welfare gain from the reduction of the “strategic” distortion exceeds possible welfare losses due to distortions caused by the compulsory savings system in combination with the assumed imperfection in the credit market. Hence, an actuarially fair compulsory savings system may very well turn out to be ineffective in many situations. An interesting question for future research then is whether there exists some, possibly redistributional, social security system that can overcome these difficulties.$^{15}$

In situations in which common concern, as modeled above, is operative within or between organizations (for example, between the state and state-owned firms, or between an international aid organization and “client” countries), costs due to “strategic” distortions may clearly arise. This is a second line for further research. In particular, suppose that individual 1 in the model is one unit of an organization and individual 2 another unit of the same organization. Suppose that the first unit strives to maximize some overall “performance index” $u_1 + u_2$, where $u_1$ is the component associated with its own activities and $u_2$ the component associated with the second unit. In contrast, the second unit strives to maximize its performance index ($u_2$) only. The analysis in this paper then shows that, for certain combinations of endowments and preferences, the second unit will use up the bulk of its resources already in the first period, in anticipation that the first unit will then transfer additional resources in the second period, even if this leads to an (ex ante) inefficient outcome. This phenomenon may be interpreted as a “soft” budget constraint in the sense of Kornai (1980a, 1980b).
Adult equivalence scales are supposed to measure differences in the “needs” of households of different demographic composition. Formally, they purport to measure the change in the cost of attaining a certain welfare level when the family composition varies. This paper shows that the definition and the measurement of these scales depend crucially on the concept of welfare used. When welfare is the utility parents derive from their own consumption, one has to assume separability of parents’ and children’s consumption. This assumption implies that the only way of imputing the intrafamily allocation of resources is by observing the consumption patterns of adult goods. Regardless of the definition used, one cannot separate the factors reflecting home technology (i.e., “needs”) from those determining the intrafamily distribution rule (i.e., “wants”) out of consumption data.


I. Introduction 


Adult equivalence scales have recently celebrated their first centenary, but the debate on how to estimate them and their welfare implications has not yet subsided.$^{1}$


I benefited from the comments of Gary Becker, Angus Deaton, Zvi Griliches, Edward Lazear, Robert Michael, Robert Pollak, Eytan Sheshinski, Menachem Yaari, Shlomo Yitzhaki, an anonymous referee, and participants in workshops at the National Bureau of Economic Research, the University of Chicago, the University of Wisconsin-Madison, Columbia University, and the Hebrew University. The research was supported by the U.S.-Israel Binational Science Foundation and by National Science Foundation grant SES-8520219.

1 For a recent manifestation of this controversy, see Deaton and Muellbauer (1986).

© by The University of Chicago. All rights reserved.
Adult equivalence scales are (quoting Deaton and Muellbauer [1980]) a sophisticated way of head counting when comparing the living standards of families of different size and composition. They are supposed to account for differences in the “needs” (or “requirements”) of people of different ages and sexes, and for returns to scale in home production, providing a deflator by which the budgets of different household types can be converted to a need-corrected basis to allow for welfare comparisons.

The formal analysis of adult equivalence scales often follows that of price indices. Whereas the latter measure the change in the cost of attaining a certain welfare level when prices vary, the former purport to measure the change in costs when family composition varies. Since the household’s welfare is unobserved, the measurement of price indices is based on the assumption that the welfare the household derives from a given bundle of goods is unaffected by the price change. This assumption cannot be applied in the case of demographic changes. The same bundle of goods is assumed to generate different levels of welfare as the household’s demographic composition changes. Since there is no direct way of comparing the welfare levels of families of different composition, the consumption of specific goods (or their share in total consumption) is used as a gauge of welfare. What criterion should govern the choice of these specific goods is a question that has not yet received a satisfactory answer.

A special feature of the adult equivalence scales literature is the emphasis on the changes in needs associated with a change in family composition. Family needs have never been well defined in this literature. Trying to infer the change in needs from consumption patterns of families of different composition, one often forgets that these patterns are affected not merely by a change in “needs” but also by a change in “wants.” The needs-wants distinction plays only a minor role in neoclassical theory. There is an analogous distinction, based on the theory of home production, between consumption technology and intrafamily allocation. The evaluation of the welfare implications of equivalence scales depends heavily on the answer to the question whether, in consumption data, one can separate home technology from distribution decisions.

Finally, recent criticism (Pollak and Wales 1979) pointed out the
restrictive nature of the definition of welfare implicit in the analysis— 
a definition that ignores the direct effect of changes in family size 
(e.g., children) on welfare. The nature of these restrictions and the 
underlying assumptions concerning the family welfare function are 
essential to the understanding of equivalence scales and their estimation.

In reexamining adult equivalence scales, one cannot separate these issues. The nature of the family welfare function and the distinction between consumption technology and distribution dictate the estimation procedure. To understand adult equivalence scales, one has to understand the process of intrafamily allocation.

The change in focus calls for a change in approach. Whereas traditionally the topic is discussed within the framework of consumption theory, the departure point of this paper is the economics of the family. I open with a brief description of the traditional methods of estimating the scales, pointing out the difficulties involved. Analyzing the nature of the welfare function implicit in traditional analysis, I point out the effect of intrafamily distribution on welfare and discuss the factors governing the allocation of resources between parents and children (distinguishing between home technology and other factors). In the absence of direct information on intrafamily allocation, it has to be imputed from observed consumption patterns. The theoretical analysis serves as a major guideline in this imputation, and the suggested procedure is compared with existing methods. The paper closes with a discussion of the welfare implications of adult equivalence scales.

II. Background: The Traditional Methods of 
Estimation 

Equivalence scales purport to measure the relative incomes required to enable families of different size to enjoy the same standard of living. Their derivation by empirical analysis of expenditure patterns is usually regarded as superior to arbitrarily setting households’ objective needs and determining the cost of achieving them. Unfortunately, “standards of living” (or welfare levels) are unobserved. Whereas, in general, income or consumption levels can serve as indicators, an inherent part of the problem is that the welfare generated by a fixed bundle of goods changes as family composition changes. Much of the literature is devoted to the search for a substitute indicator of welfare.

The oldest method of deriving equivalence scales is the one proposed by Engel (1895): the proportion of expenditure $X$ spent on food is the best measure of the material standard of living. Thus let $q$ denote the amount of food consumed (or expenditures on food). In comparisons of a family of composition $k$ with the base group $k_0$ (fig. 1), both families enjoy the same welfare if they spend the same proportion of their expenditures ($q/X$) on food. Their relative expenditure levels ($X/X_0$) can be used as an “objective” measure of the increase in family needs, that is, as equivalence scales.
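The Engel comparison can be sketched in a few lines. The food-share curve below, a Working-Leser form with made-up coefficients, is purely hypothetical and stands in for an estimated Engel curve; the only property the sketch relies on is that the food share falls as total expenditure rises. The function names are likewise illustrative.

```python
import math


def food_share(X, k, a=0.9, b=0.1, c=0.05):
    """Hypothetical Engel curve: food share q/X for expenditure X and family size k."""
    return a - b * math.log(X) + c * k


def engel_scale(X0, k0, k, share=food_share):
    """Return X/X0, where X gives family k the food share of family k0 at X0.

    Assumes the share is strictly decreasing in X and that the solution
    lies within a factor of 1,000 of X0.
    """
    target = share(X0, k0)
    lo, hi = X0 * 1e-3, X0 * 1e3
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic scale
        if share(mid, k) > target:
            lo = mid  # share still too high: the family needs more expenditure
        else:
            hi = mid
    return math.sqrt(lo * hi) / X0


if __name__ == "__main__":
    # Scale for a three-person family relative to a two-person base group.
    print(engel_scale(1000.0, 2, 3))
```

For this particular functional form the scale has the closed form $\exp[c(k - k_0)/b]$, which makes it easy to see how strongly the result depends on the assumed shape of the Engel curve.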

Rothbarth (1943) suggested replacing food expenditures by the absolute expenditure on adult goods: goods consumed exclusively by adults. By this method the comparison of incomes (or total expenditures $X$) of households with the same level of expenditure on adult goods (fig. 2) yields the scales $X/X_0$ ($q$ in this case is the amount of adult goods consumed).

As household size increases ($k > k_0$), one expects $X > X_0$. When total expenditures ($X$) are held constant, an increase in household size is associated with an increase in the consumption of some goods and a decline in that of others. The Rothbarth method yields the expected result ($X > X_0$) as long as the expenditure on the goods used to gauge welfare declines as household size increases. With the Engel method, the condition is satisfied in the case of income-inelastic goods (such as food) if their expenditures increase with family size and in the case of income-elastic goods if they decline with family size.

The scales naturally depend on the specific good used as an indicator of welfare (as well as on the level of total expenditure). Economic theory is not very helpful in choosing the “true” indicator of welfare (if any). Some studies (Espenshade 1984) have investigated which good (or group of goods) best satisfies certain prior requirements that equivalence scales are expected to meet. This procedure runs the risk of tautology: it chooses the indicator that yields the “best” preassumed results. As summarized by Deaton and Muellbauer (1986), in this case model selection is a crucial determining factor, the choice of concept dictates the choice of a functional form, and there is no way of judging which method is superior.

Fig. 2

To circumvent this criticism, researchers often revert to the Barten (1964) method, estimating scales that are weighted averages of all goods-specific scales (the weights being the corresponding price elasticities). According to Barten, the household’s welfare function has the form

$$U = U\left[\left(\frac{q_1}{m_1}\right), \ldots, \left(\frac{q_n}{m_n}\right)\right], \tag{1}$$

where $q_i$ are the quantities of the goods consumed, and $m_i$ are the goods-specific deflators. These deflators are assumed to be exogenously given; they depend solely on demographic variables (such as family size and composition) but are independent of economic variables (such as income and prices). By this argument a change in the household’s demographic structure involves not only a “real income” effect but also a price effect: the price of goods for which $m_i$ is sensitive to changes in demographic variables changes relative to the price of those for which $m_i$ is insensitive.
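The price effect can be made explicit with a short change of variables. The derivation below is a standard reformulation of the Barten specification, written in notation introduced here for illustration ($p_i$ for prices, $X$ for total expenditure, and starred symbols for the rescaled quantities and prices).

```latex
\begin{align*}
&\max_{q_1,\ldots,q_n}\ U\!\left(\frac{q_1}{m_1},\ldots,\frac{q_n}{m_n}\right)
  \quad \text{subject to} \quad \sum_{i=1}^{n} p_i q_i = X . \\[1ex]
&\text{Define } q_i^{*} = \frac{q_i}{m_i} \text{ and } p_i^{*} = p_i m_i .
  \text{ Then } p_i q_i = p_i^{*} q_i^{*}, \text{ and the problem becomes} \\[1ex]
&\max_{q_1^{*},\ldots,q_n^{*}}\ U\!\left(q_1^{*},\ldots,q_n^{*}\right)
  \quad \text{subject to} \quad \sum_{i=1}^{n} p_i^{*} q_i^{*} = X .
\end{align*}
```

In these terms a demographic change that raises $m_i$ acts exactly like a rise in the effective price $p_i^{*}$ of good $i$: it lowers real income and, whenever the $m_i$ change by different proportions, it also alters relative effective prices.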

The Barten method may seem to solve the problem entailed in the arbitrary choice of one good as an indicator of welfare, but the explicit formulation of the welfare function uncovers some much more fundamental questions: (a) Whose welfare is described by equation (1)? Is it the parents’, the children’s, or some “average” welfare concept? If it is the parents’ welfare, is it from total household consumption or merely from their own consumption?$^{2}$ (b) To what extent can $m_i$ be regarded as exogenous parameters that are unaffected by economic variables? In general, economists are reluctant to distinguish between needs and wants, where the first are supposedly not affected by economic variables but the latter are. Should an exception be made in this case and the $m_i$ be regarded as needs, that is, parameters of home production technology that are exogenously given? If so, can one isolate these parameters from consumption patterns? The answers to these questions call for further formal treatment of adult equivalence scales.


III. Adult Equivalence Scales and Intrafamily 
Allocation 


The formal analysis of adult equivalence scales closely follows that of price indices. Formally, let $C(U, p, k)$ be the cost to the household of attaining welfare level $U$ given the price vector $p$ and family composition $k$. Then the adult equivalence scales are defined as

$$m = \frac{C(U, p, k)}{C(U, p, k_0)}, \tag{2}$$

that is, the relative cost of attaining a welfare level $U$, given a price vector $p$, when the family composition is $k$ compared with a family composition $k_0$. Let $k$ denote a family with children and $k_0$ a family without children; then the denominator, of course, relates to the parents’ welfare. Since the same criterion of welfare has to apply to the numerator, it is only natural to assume that the welfare yardstick in this comparison is the parents’ welfare (Deaton and Muellbauer 1986).$^{3}$ Children’s welfare is relevant only to the extent that it affects their parents’.
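Definition (2) can be sketched under one concrete specification. The code below assumes a Barten-scaled Cobb-Douglas utility, $U = \prod_i (q_i/m_i)^{\beta_i}$ with $\sum_i \beta_i = 1$; this functional form, the function names, and the numbers are assumptions chosen for illustration and are not taken from the text. Under this form the cost function is $C = U \prod_i (p_i m_i/\beta_i)^{\beta_i}$, so the scale reduces to a weighted geometric mean of the goods-specific deflator ratios.

```python
import math


def cost(U, prices, deflators, betas):
    """Cost of reaching U for an assumed Barten-scaled Cobb-Douglas utility."""
    if abs(sum(betas) - 1.0) > 1e-12:
        raise ValueError("budget-share parameters must sum to one")
    return U * math.prod((p * m / b) ** b
                         for p, m, b in zip(prices, deflators, betas))


def equivalence_scale(U, prices, deflators_k, deflators_k0, betas):
    """m = C(U, p, k) / C(U, p, k0), as in equation (2)."""
    return cost(U, prices, deflators_k, betas) / cost(U, prices, deflators_k0, betas)


if __name__ == "__main__":
    prices = [1.0, 2.0]
    betas = [0.5, 0.5]
    # Hypothetical deflators: good 1 is four times as "needed" with children.
    print(equivalence_scale(1.0, prices, [4.0, 1.0], [1.0, 1.0], betas))
```

In this special case the scale does not depend on $U$ or on prices, which is a property of the Cobb-Douglas assumption and not of definition (2) in general.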

But this immediately gives rise to a second question: What concept of parents’ welfare is used in this definition? As pointed out by Pollak and Wales (1979), unlike price changes, which are usually not controlled by the household, changes in family composition (marriage, divorce, or additional children) are very often volitional. With the standard concept of utility, $m$ becomes merely a measure of the desirability of the change in the family composition (e.g., it will be less than unity whenever the change is desirable and will exceed unity when the change is undesirable).

2 Barten, who was primarily concerned with demand analysis, regarded $U$ as the household utility function and did not bother to specify which household members he had in mind.

3 A more detailed discussion of the family welfare function appears in Pollak (1985).

Since this is clearly not the purpose of adult equivalence scales, one has to revert to a more restrictive definition of the parents’ welfare: the utility parents derive from the goods and services they consume themselves (Deaton and Muellbauer 1986). But this restriction is not sufficient to “save” these scales. If changes in family composition affect the utility that parents derive from their own consumption (e.g., children affect the utility parents derive from watching television or taking a trip), then the same bundle of goods may generate different welfare levels as the family composition changes. From the observation that parents consume the same bundle of goods and services before and after having children, if one is to conclude that their welfare (in the restricted sense) has not changed, one has to add the assumption that the existence of children and the composition of their consumption do not affect the utility parents generate from their own consumption.

The assumption of separability is a crucial (though often ignored) component of the welfare analysis concerning changes in family composition and a necessary ingredient in the definition of adult equivalence scales. It is also the cornerstone of the estimation procedure discussed in the next section.

Separability of parents’ and children’s consumption seems, at first glance, a hard assumption to defend. (Don’t we parents always claim that our children have changed our lives?) But after consideration, it looks like an assumption one can live with (though not without reservations), and in fact most economists have always taken it for granted.

Given the misunderstandings surrounding the concept of separability, some clarifications are called for. Separability does not rule out the assumption that parents care for their children and care what their children consume. Weak separability rules out the case that children’s composition of consumption affects parents’ preferences for their own consumption (i.e., the marginal rate of substitution between any two goods consumed by the parents).$^{4}$ Thus the assumption is consistent with parents’ stuffing their children with spinach but is inconsistent with their starting to eat spinach themselves to set a personal example. More seriously, it is inconsistent with children’s affecting their parents’ pleasure from watching television, going to the ball game, or listening to a Bartok quartet. The importance of these interactions between parents’ and children’s consumption is not clear and calls for more empirical evidence.

4 Separability also implies that the composition of parents’ consumption does not affect the utility they derive from their children’s consumption. However, this part is not required in the analysis.
A second source of interaction between parents’ and children’s consumption may originate from the existence of “family goods,” that is, public goods consumed by the family (e.g., housing, appliances, or electricity). Distinguishing between parents’ and children’s private goods and family goods, one can still maintain the assumption of separability of parents’ and children’s private goods as long as parents’ private and family goods are separable. When the link between parents’ and children’s consumption of private goods is broken in this fashion, adult equivalence scales have to be redefined to apply to the welfare generated by parents’ consumption of private goods. The damage caused by such a redefinition is not clear; it is, however, in line with the long tradition of ignoring public goods both in demand studies of private goods and in the measurement of other welfare indices (e.g., the price index).$^{5}$

It is often hard to distinguish between family goods and indivisibilities (or adjustment lags) in consumption. Housing and utilities are frequently cited examples of family goods, but as a family expands it eventually moves to a larger house (not only with more bedrooms but with a larger kitchen and more bathrooms), and it uses more water, gas, and electricity (not to mention telephone services). Excess capacity in housing and appliances may reduce the fixed costs of additional children but does not stand in contrast to the separability of parents’ and children’s consumption.

An associated issue is returns to scale in consumption. 6 Even if 
preferences are separable, the composition of parents’ and children’s 
consumption may not be independent because of increasing returns 
to scale. Thus parents may switch from lettuce to spinach not because 
they feel they should set a personal example but because it is 
“cheaper” to prepare one big bowl of spinach salad than two separate 
salads. Children’s clothing handed down from older to younger chil¬ 
dren is another example of these economies of scale due to sharing. 
Fixed costs of home production may create a link between parents' 
and children’s consumption when it comes to all-or-nothing decisions 
(e.g., should the parents have a lettuce salad?) but not when marginal 
decisions are concerned (e.g., how much lettuce should be used for 
the parents’ salad?). Decreasing incremental costs of children can be 
incorporated into the model without infringing on the assumption of 
separability. 

Finally, the definition of adult equivalence scales (m) in equation (2) 
rests on the assumption that demographic changes do not affect 

5 The damage may be more severe if private and family goods are more closely 
related than (household) private and public goods. 

6 For a more detailed analysis of returns to scale in consumption, see Pollak and
Wales (1981).
prices. But even if children do not affect the parents’ preferences for
their own consumption, they may affect the shadow prices facing the 
household. It is well established that children increase their mothers’ 
shadow price of time and hence increase the relative price of time¬ 
intensive activities. Even in a model that ignores the allocation of time 
(as is the case in all studies of adult equivalence scales), this should be 
borne in mind. Sometimes children also affect the pecuniary costs of 
an activity (e.g., the cost of a visit to the theater is lower for a childless 
family than for a family that has to pay for babysitting). These cases 
are, however, rare. 7 

In summary, though the assumption of separability should be ac¬ 
cepted with reservations, their relative importance (compared, e.g., 
with the reservations accompanying any study of demand or welfare 
indices) cannot be ascertained without additional empirical research. 

Returning to the formal presentation, let parents’ welfare, $U$, consist
of the welfare they derive from their own consumption, $U^A$, and
the welfare they derive from having children and from their children’s
consumption, $U^B$:

$$U = f(U^A, U^B). \tag{3}$$

A prerequisite for the measurement of adult equivalence scales is the
separability of $U^A$ and $U^B$.

Thus $U^A$ depends solely on the “commodities” consumed by parents
($Z_i^A$),

$$U^A = U^A(Z_1^A, Z_2^A, \ldots, Z_n^A), \tag{4}$$

and $U^B$ depends on the number of children, $K$, and their consumption
($Z_i^B$),

$$U^B = U^B(Z_1^B, Z_2^B, \ldots, Z_n^B; K). \tag{5}$$


Family goods are ignored. 

To introduce differences between parents’ and children’s needs, it 
is assumed that the home production functions of parents’ and chil¬ 
dren’s commodities differ. For simplicity, let 

$$Z_i^A = \frac{q_i^A - \delta_i^A}{\rho_i^A}, \qquad Z_i^B = \frac{q_i^B - \delta_i^B}{\rho_i^B}, \tag{6}$$


7 I intentionally do not mention in this context changes in the Barten prices. Barten
argued that the price of goods changes by a factor of $m_i$ if children share in the
consumption of that good. I refer to this point later.
where $q_i^A$ and $q_i^B$ are the quantities consumed by parents and children,
respectively. The parameters $\delta_i^j$ denote the minimum subsistence
levels of $q_i^j$, and $\rho_i^j$ are the marginal inputs of $q_i^j$ in the production
of the commodity $Z_i^j$.8 If children’s needs are conceived to be lower
than parents’, then $\delta_i^B < \delta_i^A$ or $\rho_i^B < \rho_i^A$. For example, if $Z_i$ stands for
nutritional requirements, it may be assumed that the subsistence
levels of children are lower than those of parents, and the same
amount of food produces higher nutritional values for children than
for adults.

Welfare is maximized subject to the budget constraint 

$$\sum_i p_i q_i + C(K) = X, \tag{7}$$

where $q_i = q_i^A + q_i^B$, $p_i$ denote prices, $C(K)$ are the fixed costs of having
children, and $X$ is total consumption. I assume a one-period model, a
fixed supply of labor, and an exogenously given number of children.9

The assumption of separability implies two-stage budgeting (Deaton
and Muellbauer 1980, chap. 5).10 The allocation decision is made
in two stages: In the first, parents decide how to allocate the budget
between their own consumption ($X^A$) and that of their children ($X^B$),
where

$$X^A + X^B = X - C(K). \tag{8}$$

In the second stage they decide how to allocate each budget, $X^j$, between
the different goods, where

$$\sum_i p_i q_i^j = X^j, \qquad j = A, B. \tag{9}$$

The utility the parents derive from their own consumption ($U^A$) depends
exclusively on their own budget, $X^A$, and is independent of the
children’s consumption, $X^B$, and their number:

$$U^A = V^A(X^A, p). \tag{10}$$

Similarly, the utility the parents derive from their children’s consumption
is

$$U^B = V^B(X^B, p; K). \tag{11}$$

Stated differently, the cost of the parents’ consumption that will generate
a welfare level $U^A$ is $X^A$. To know the expenditure level that will

8 A similar formulation is used by Pollak and Wales (1981) as the household utility
function. Note that in the present case h refers to the individual household member.

9 Alternatively, one can assume that the utility function is separable in goods and
leisure and in current and future consumption.

10 For further discussion of the implications of two-stage budgeting, see Strotz (1957,
1959) and Gorman (1959). The discussion here closely follows that of Samuelson's
(1956) seminal analysis of family indifference curves.
assure the parents in a family with $K$ children of a welfare level $U^A$,
one has to decipher the first-stage allocation decision

$$X^A = h(X, p, C; K). \tag{12}$$

Hence, a normative study of equivalence scales has to start with the 
study of the intrafamily allocation of goods. 

It is expected that parents’ consumption will be positively related to
total family consumption. Price changes have a real-income effect, but
they may also change the relative price of $X^A$ if the composition of
goods constituting $X^A$ differs from that in $X^B$ and when the price
change does not affect all goods equally. The effect of an increase in
the fixed costs of children $C$ (or in their number) is similar to that of a
reduction in total resources and should reduce $X^A$.

The optimum combination of goods depends on consumption tech¬ 
nology, that is, the way goods are transformed into utility (eqq. [4] 
and [5]) and the distribution of welfare within the family (eq. [3]). 
How is consumption technology reflected in the intrafamily allocation 
of goods? The conditions defining the optimum 

$$\frac{u_i^j}{\rho_i^j} = \lambda p_i, \qquad j = A, B, \tag{13}$$

state that the marginal utility of each dollar spent on goods consumed
by parents and children has to be the same, where
$u_i^j = (\partial U/\partial U^j)(\partial U^j/\partial Z_i^j)$.
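
As a brief sketch of where condition (13) comes from, one can write the Lagrangian of the problem, assuming the linear technology of equation (6) and the budget constraint (7). The symbol $\mathcal{L}$ and the explicit multiplier $\lambda$ on the budget constraint are notation introduced here for illustration only.

```latex
\begin{aligned}
\mathcal{L} &= f\bigl(U^A(Z^A),\, U^B(Z^B; K)\bigr)
  + \lambda\Bigl[X - C(K) - \sum_i p_i\,(q_i^A + q_i^B)\Bigr],
  \qquad Z_i^j = \frac{q_i^j - \delta_i^j}{\rho_i^j},\\[4pt]
\frac{\partial \mathcal{L}}{\partial q_i^j}
  &= \frac{\partial U}{\partial U^j}\,
     \frac{\partial U^j}{\partial Z_i^j}\,
     \frac{1}{\rho_i^j} - \lambda p_i = 0
  \quad\Longrightarrow\quad
  \frac{u_i^j}{\rho_i^j} = \lambda p_i, \qquad j = A, B.
\end{aligned}
```

The factor $1/\rho_i^j$ is simply $\partial Z_i^j/\partial q_i^j$, which is why needs enter the optimum condition in the same way as a goods-specific price.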

Even if an outside researcher could observe the individual con¬ 
sumption of each family member, there is no way of disentangling 
wants from needs. If one member is observed to get a greater share of 
total resources, there is no way of telling whether this should be 
attributed to his greater needs ($\rho_i^j$) or to the fact that he enjoys
privileged status with the family decision maker. There is no way of
inferring changes in needs (or relative needs) from consumption patterns
without making explicit assumptions about the parents’ welfare
function $U = f(U^A, U^B)$. Specifically, one cannot learn anything about
children’s needs compared with those of their parents unless it is
assumed that parents assign the same weight to their children's consumption
that they assign to their own (i.e., $U_A = U_B$, where $U_j = \partial U/\partial U^j$).11

The assumption that the weight placed by parents on their children’s
consumption ($\partial U/\partial U^B$) does not change with the children’s age
is a necessary (but not sufficient) condition for inferring the change in
children’s needs with age from the changes in consumption patterns.


11 If one could observe the individual consumption of each household member and if
the marginal utility functions ($(\partial U/\partial U^j)(\partial U^j/\partial Z_i^j)$) are the same, one can
infer the relative needs ($\rho_i^B/\rho_i^A$) by studying the relationship between $q_i^A$ and $q_i^B$.
When (6) is inserted in equation (9),

$$X^B = \sum_i p_i q_i^B = \sum_i p_i \delta_i^B + \sum_i p_i \rho_i^B Z_i^B. \tag{14}$$


Differences in children’s needs ($\rho_i^B$) play a role similar to variability in
prices. Under the assumption that parents’ care for their children
($U_B$) does not change with children’s age, an increase in children’s
needs as they grow older raises the price of children’s commodities
($Z_i^B$) and tends to lower their consumption. If all $\rho_i^B$ increase proportionately,
the children’s budget ($X^B$) will increase only if the rate of
decline of $Z^B$ falls short of the rate at which $\rho^B$ increases. Thus the
children’s budget increases with age at the expense of their parents’
only if the elasticity of substitution between parents’ and children’s
consumption is less than unity.12
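
The role of the elasticity of substitution can be checked numerically. The sketch below is illustrative only and rests on simplifying assumptions that are not in the text: parents' welfare is a CES function of $U^A$ and $U^B$, subsistence levels are zero, the common price index is normalized to one, and children's commodities cost `rho` times as much as parents'. All function names are hypothetical.

```python
def children_budget_share(beta, sigma, rho):
    """Closed-form children's share X^B / (X^A + X^B) under the assumptions above.

    With U^A = X^A and U^B = X^B / rho, the first-order condition of the CES
    problem gives X^A / X^B = (beta / (1 - beta))**sigma * rho**(sigma - 1).
    """
    ratio = (beta / (1.0 - beta)) ** sigma * rho ** (sigma - 1.0)
    return 1.0 / (1.0 + ratio)


def ces_welfare(x_a, total, beta, sigma, rho):
    """CES welfare for a given parents' budget x_a (requires sigma != 1)."""
    gamma = 1.0 / sigma - 1.0
    u_a = x_a
    u_b = (total - x_a) / rho
    return (beta * u_a ** (-gamma) + (1.0 - beta) * u_b ** (-gamma)) ** (-1.0 / gamma)


def children_share_by_search(beta, sigma, rho, total=1.0, steps=10000):
    """Brute-force check: grid-search the welfare-maximizing split of the budget."""
    best_step = max(
        range(1, steps),
        key=lambda s: ces_welfare(total * s / steps, total, beta, sigma, rho),
    )
    return 1.0 - best_step / steps


# Children's needs double (rho goes from 1 to 2) under two elasticities.
inelastic_before = children_budget_share(0.5, 0.5, 1.0)
inelastic_after = children_budget_share(0.5, 0.5, 2.0)
elastic_before = children_budget_share(0.5, 2.0, 1.0)
elastic_after = children_budget_share(0.5, 2.0, 2.0)
```

Under these assumptions the children's share rises with needs when the elasticity is below one and falls when it is above one, which is the pattern described in the paragraph above.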

An increase in children’s subsistence level ($\delta_i^B$) with age is analogous
to an increase in the fixed costs of producing $Z^B$. It will increase the
children’s budget as long as the increase in fixed costs is not offset by
the decline in children's commodities ($Z_i^B$), that is, as long as the marginal
propensity of children’s consumption ($\partial X^B/\partial X$) is less than unity.


IV. The Identification of the Allocation of Resources


There is no direct evidence on the allocation of most goods within the 
household. In the absence of such data one has to impute the parents’ 
and children’s consumption from data describing the household’s 
overall consumption. How does the formal analysis here help in this 
imputation?

Under the assumption that there are no public goods in the house¬ 
hold and given the assumption of separability, the consumption of 


12 For example, let $U$ be a constant elasticity of substitution function,

$$U = [\beta (U^A)^{-\gamma} + (1 - \beta)(U^B)^{-\gamma}]^{-1/\gamma},$$

and $U^A$ and $U^B$ be Cobb-Douglas,

i/ A - <?. A r<?£) , - a . 

where p ■* pfp*~*. Then 

- n *«( T rj)v ~ ° ^ y *** *• 

where $\sigma = 1/(1 + \gamma)$ is the elasticity of substitution ($\sigma > 0$). The parents' share increases
with the distribution parameter $\beta$. It increases with children's needs ($\rho$) when $\sigma >
1$ and with the price of $q_1$ if parents’ consumption is more intensive in $q_1$ (i.e., $\alpha > 0$)
if $\sigma < 1$.
good $i$ by parents is solely a function of their budget ($X^A$), prices ($p$),
and consumption technology ($\delta^A$ and $\rho^A$):

$$q_i^A = g_i^A(X^A, p, \delta^A, \rho^A). \tag{15}$$

Similarly, for children,

$$q_i^B = g_i^B(X^B, p, \delta^B, \rho^B; K). \tag{16}$$

Total family consumption of good $i$ therefore equals

$$q_i = q_i^A + q_i^B = g_i^A[h(\cdot), p, \delta^A, \rho^A] + g_i^B[X - C - h(\cdot), p, \delta^B, \rho^B; K], \tag{17}$$

where $h(\cdot) = h(X, p, \delta, \rho, C; K)$ is the intrafamily distribution rule (eq.
[12]). Thus, in general, the family consumption function reflects both
factors that are goods-specific ($g_i^A$ and $g_i^B$) and the distribution rule.
Assuming separability and identical tastes, one can identify the factors
that are specific to parents’ consumption ($g_i^A$) by observing the
behavior of childless families, but there is no way of disentangling the
factors specific to children’s consumption ($g_i^B$) from the distribution
rule $h$ by observing the consumption patterns of families with children.13
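
The identification problem can be made concrete with a small numerical sketch. It assumes, purely for illustration, linear demand functions for parents and children and a linear distribution rule with fixed costs ignored; the parameter values and names are hypothetical. Two quite different allocations of the family budget produce exactly the same observable family demand.

```python
import math


def family_demand(x, alpha_a, alpha_b, beta):
    """Family demand for one good under a linear specification.

    Each member's demand is q^j = a0 + a1 * X^j, and the budget is split as
    X^A = b0 + b1 * X and X^B = -b0 + (1 - b1) * X (fixed costs ignored).
    """
    (a0_a, a1_a), (a0_b, a1_b), (b0, b1) = alpha_a, alpha_b, beta
    x_a = b0 + b1 * x
    x_b = -b0 + (1.0 - b1) * x
    return (a0_a + a1_a * x_a) + (a0_b + a1_b * x_b)


# Hypothetical parameter sets: parents receive half of the budget in the
# first family type and only a quarter in the second.
equal_split = dict(alpha_a=(1.0, 0.2), alpha_b=(0.5, 0.4), beta=(0.0, 0.50))
parents_get_quarter = dict(alpha_a=(1.0, 0.6), alpha_b=(0.5, 0.2), beta=(0.0, 0.25))

budgets = [100.0, 200.0, 400.0]
demand_equal_split = [family_demand(x, **equal_split) for x in budgets]
demand_parents_quarter = [family_demand(x, **parents_get_quarter) for x in budgets]
```

Both parameter sets imply the same family Engel curve, so household-level data on this good cannot distinguish the two distribution rules.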

One may postulate that parents’ and children’s consumption functions
are identical, $g_i^A = g_i^B = g_i$. Intuitively, this assumption seems
unrealistic. Moreover, it does not save the day: in general, it does not
ensure the identification of the distribution rule.14

Studies of the effect of demographic variables on demand often
assume that this effect is additive.15 Specifically, these studies ignore
any interaction between demographic variables and family income;
the marginal propensity to consume is unaffected by family composition.
They therefore implicitly assume that the marginal propensity to
consume ($\partial q_i^j/\partial X^j$) is constant and identical for parents and children;
that is, $g_i^j$ is a linear function in $X^j$. In this case the demand is unaf-

13 For example, let $q_i^j$ be a linear function of $X^j$, $q_i^j = \alpha_0^j + \alpha_1^j X^j$ ($j = A, B$), and $X^j$ be
a linear function of $X$, $X^A = \beta_0 + \beta_1 X$, $X^B = -\beta_0 + (1 - \beta_1)X$. Then

$$q_i = q_i^A + q_i^B = (\alpha_0^A + \alpha_0^B) + (\alpha_1^A - \alpha_1^B)\beta_0 + [\alpha_1^A \beta_1 + \alpha_1^B(1 - \beta_1)]X,$$

and there is no way of separating the $\alpha$'s from the $\beta$’s. Note that I do not refer to
identification in the statistical sense, an issue raised in the context of the estimation of
the Barten scheme.

14 For example, if $g_i$ is a quadratic function of $X^j$, $q_i^j = \alpha_0 + \alpha_1 X^j + \alpha_2 (X^j)^2$, the family
consumption is (ignoring fixed costs) $q_i = 2\alpha_0 + \alpha_1 X + \alpha_2 X^2 - 2\alpha_2 X^A(X - X^A)$, and
there is no way of identifying $X^A$ from this equation.

15 Pollak and Wales (1981) denote this case as “demographic translating.” In this case,
$q_i = d_i(K) + g_i(X, p)$; the demographic variables ($K$) do not affect the marginal
propensity to consume. This assumption is implicit, for example, in all past attempts to
estimate the demand for adult goods (e.g., Nicholson 1949, 1976). It is dominant in the
literature discussing the supply of labor.


(15) 

(16) 

(17) 
fected by the intrafamily distribution of resources. Unfortunately, it
follows that there is no way of identifying the distribution rule from 
consumption data. 16 

A necessary condition for the identification of $h$ is the assumption
that some of the goods are not consumed by children ($q_i^B = 0$). The
demand for these adult goods by families with children is

$$q_i = g_i^A[h(\cdot), p]. \tag{18}$$

Since $g_i^A$ can be estimated from a sample of childless families, one can
identify $h$ as long as $g_i^A$ is a monotonic function in $X^A$:

$$h(\cdot) = (g_i^A)^{-1}(q_i, p). \tag{19}$$

The existence of adult goods (and the monotonicity of $g_i^A$) is a
necessary condition for the identification of $h$. It does not, however,
provide the researcher with a way of separating home technology
from preferences. In the previous section I distinguished between
fixed costs of children ($C$) and two groups of parameters that determine
consumption technology ($\delta_i$, $\rho_i$). Fixed costs play a role similar to
taxes, reducing disposable income. An increase in fixed costs associated
with children reduces the parents’ budget ($X^A$) and their consumption
of adult goods. There is, however, no simple way of separating this
effect from other fixed effects of children.18
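
The identification argument based on adult goods can be sketched in a few lines. The example assumes, for illustration only, that the adult-good demand of childless families is linear in their budget and is estimated by ordinary least squares; the data are made up and the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (pure Python)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def impute_parents_budget(adult_good, intercept, slope):
    """Invert the childless families' demand curve to recover X^A.

    The inversion requires a monotonic demand function, i.e. slope != 0.
    """
    if slope == 0:
        raise ValueError("adult-good demand must be monotonic in the budget")
    return (adult_good - intercept) / slope


# Step 1: estimate the adult-good demand curve from (made-up) childless families,
# for whom the parents' budget equals total consumption.
childless_budget = [100.0, 150.0, 200.0, 250.0]
childless_adult_good = [15.0, 20.0, 25.0, 30.0]
intercept, slope = fit_line(childless_budget, childless_adult_good)

# Step 2: a (made-up) family with children spends 300 in total and buys 27 units
# of the adult good; the fitted curve implies the parents' own budget.
family_total = 300.0
parents_budget = impute_parents_budget(27.0, intercept, slope)
children_and_fixed_costs = family_total - parents_budget
```

In this sketch the imputed parents' budget is about 220, leaving about 80 for children's consumption and fixed costs combined; as the paragraph above notes, the two cannot be separated without further assumptions.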

16 Let $q_i^j = \alpha_0^j + \alpha_1 X^j$. Then the family demand is $q_i = (\alpha_0^A + \alpha_0^B) + \alpha_1(X - C)$. At
best one can learn from demand patterns something about the fixed cost ($C$). Nothing,
however, can be learned about the distribution rule $h$.

17 Differentiating (17), we get

$$\frac{\partial q_i}{\partial X} = \frac{\partial g_i^A}{\partial X^A}\frac{\partial X^A}{\partial X} + \frac{\partial g_i^B}{\partial X^B}\left(1 - \frac{\partial X^A}{\partial X}\right),$$

$$\frac{\partial q_i}{\partial Z} = \frac{\partial g_i^A}{\partial Z} + \frac{\partial g_i^B}{\partial Z} + \left(\frac{\partial g_i^A}{\partial X^A} - \frac{\partial g_i^B}{\partial X^B}\right)\frac{\partial X^A}{\partial Z},$$

where $X^A = h(\cdot)$ denotes the parents’ budget (eq. [12]), and $Z$ denotes any other
variable in eqq. (15) and (16). One can estimate $\partial q_i/\partial X$ and $\partial q_i/\partial Z$ from the sample of
families with children, and one can estimate the function $g_i^A$ from a sample of families
without children, but one cannot derive the term $\partial X^A/\partial X$ from the first equation without
specifying the value of $\partial g_i^B/\partial X^B$. The simplest specification is naturally $\partial g_i^B/\partial X^B = 0$.
Similarly, one cannot derive $\partial X^A/\partial Z$ from the second equation without specifying
$\partial g_i^B/\partial Z$. Again, equating this term to zero is the simplest specification. Identification is
not served by the existence of goods that are consumed exclusively by children ($q_i^A = 0$),
since in the absence of “parentless” families one cannot isolate $g_i^B$. Even if they existed,
these families should be removed from the sample because they are subject to a different
distribution rule.

18 For example, if $h$ is linear,

$$X^A = \beta_0 + \beta_1(X - C) = (\beta_0 - \beta_1 C) + \beta_1 X.$$

Even if $\beta_1$ can be identified, one cannot separate $C$ from $\beta_0$ without making additional
assumptions about $\beta_0$ (e.g., that $\beta_0 = 0$ or that $\beta_0$ is unaffected by the number of
children).
A change in children’s needs may affect their parents’ consumption.
In principle, one should be able to trace changes in the parameters
$\rho^B$ by observing the change in consumption patterns with children's
age. But since the effect depends not only on the change in the
consumption technology parameters but also on the substitution elasticity,
one cannot separate the two without prior information on the
substitution elasticity of parents’ consumption. A decline in parents’
consumption as children grow older is consistent with an increase in $\rho^B$ if
$X^B$ is price inelastic, but also with a decline in $\rho^B$ if $X^B$ is price elastic.

How does our measure compare with traditional estimation meth¬ 
ods? The belief that adult goods are the key to the identification of 
the distribution rule (and adult equivalence scales) is, of course, not 
novel, taking us back to Rothbarth. 19 What is novel is the interpreta¬ 
tion of the estimates and the argument that this method is generated 
directly by the underlying assumptions of the model, within the traditional
theory of consumption. It has been shown that the definition of
adult equivalence scales as an index that measures the relative costs of 
attaining a certain level of parents’ welfare when there is a change in 
family composition rests on (in the absence of a direct way of measur¬ 
ing welfare) the assumption of separability. Furthermore, any com¬ 
parison of consumption patterns between families of different com¬ 
position is based implicitly on the assumption that the underlying 
utility function of parents from their own consumption (eq. [4]) is 
unaffected by a demographic change. In the absence of separability 
one cannot tell, from comparing the consumption of a certain good 
by families with and without children, whether a decline in parents’ 
consumption is caused by a diversion of resources from parents to 
children or by the fact that parents “lose the taste” for the good once 
they have children. Thus, though the reservations about this assumption
are numerous, separability is required (in the absence of direct
information) for the measurement of the resources devoted to chil¬ 
dren and hence adult equivalence scales. 

Given this assumption, it has been shown that there is only one way of
identifying the family’s distribution rule (eq. [12]): using the consumption
patterns of adult goods as identifiers.

There is no way of justifying the choice of the share of food in total 
consumption as an indicator of welfare within the framework of stan¬ 
dard economic theory. As argued earlier, this method will generate 
similar (though not identical) results if any other good is used, as long
as it is income inelastic and its demand increases with the number of

19 The method was applied by Nicholson (1949) and many others. Lazear and
Michael (1985) reach similar conclusions, though in a different fashion. I have used the
method to estimate the intrafamily allocation rule $h$ in an accompanying paper (Gronau
1985). The method differs from that of Nicholson technically but not conceptually.
children. The estimates generated by this method are goods-specific:
they involve a mixture of parameters specific to the consumption of
food and relating to the intrafamily distribution rule and do not allow
the separation of the two. More seriously, if the marginal propensity
to consume food is the same for parents and children (as is often
implicitly assumed), the estimates reflect solely factors that relate to 
food consumption and cannot teach us anything about the distribu¬ 
tion rule. 

The Barten scheme is more sophisticated, comparing the consumption
patterns of all goods to estimate the adult equivalence scales. This
scheme assumes that the underlying utility function (eq. [1]) does not
change with family composition and that the goods-specific deflators
$m_i$ (eq. [1]) are independent of the quantities consumed, that is, of
prices and income. Can one justify these assumptions in the light of
the discussion here?

The two assumptions are not independent. One interpretation regards
the utility function in equation (1) as the outcome of a family
“consensus” and assumes that the process by which this consensus is
reached yields the same formal functional presentation in spite of the
changes in family composition. The parameters $m_i$ in this case are
deflators that depend on the outcome of the new consensus; that is,
they reflect the fact that $m_i$ units of $q_i$ consumed by a family of composition
$k$ generate the same utility that units of the same good will
generate for a family of composition $k_0$. Unfortunately, since this
model does not specify how the consensus is reached, one cannot
judge the validity of the assumption that the deflators $m_i$ are exogenous.
It is also not clear that the parameters $m_i$ are related in any
way to the intrafamily allocation system or to the various members’
needs.20

Thus if, for example, the coefficient $m_i$ for cola is 3 in a family of
four and 2 in a family of two, the only inference one can draw is that a
family of four is as happy consuming three bottles of cola as a family
of two consuming two bottles. It is not specified how the cola is divided
among the four people. Hence, if Junior consumes all three
bottles, it cannot be deduced whether this is because he needs all that
cola or because he has a veto vote in the family consensus about cola.
By this interpretation one cannot estimate the values of $m_i$ by observing
the intrafamily allocation. But even if these parameters can be


20 Pollak (1985, p. 598) in a somewhat similar context writes that “because Samuelson’s
'consensus’ is postulated, not derived, his family is simply a preference ordering.
Samuelson’s concern is to keep the lid on the 'black box,’ not to look inside.” In his
review paper Pollak discussed the alternative approaches: Becker’s "altruistic despot"
and the bargaining approach.
estimated from data on overall consumption, one cannot separate
needs from wants (i.e., the factors shaping the new consensus).

An alternative approach is the one adopted in this paper in which it
is assumed that the utility function (1) is that of the parents.21 In this
case it is tempting to interpret $m_i$ as goods-specific needs ($\rho_i$). But the
superficial similarity of equation (1) to (4) and (5) is misleading. There
is no way of separating needs from wants. If one interprets Barten’s
welfare concept as the welfare parents derive from their own consumption
(i.e., $U = U^A$ in eq. [1]), the goods-specific scales will measure
(the reciprocal of) the share of parents' consumption in the
specific good ($m_i = q_i/q_i^A$).22 This share depends both on home technology
and on preferences.

To justify the assumption that $m_i$ are exogenously given and independent
of prices and income, one has to assume that $q_i^A$ and $q_i^B$ are
consumed in fixed proportions. The ratio $q_i^A/q_i^B$ is insensitive to price
if the same elasticity of substitution among goods prevails in the individual
welfare functions $U^A$ and $U^B$ (eqq. [4] and [5]) and if the elasticity
of substitution between $U^A$ and $U^B$ is unitary. The ratio is insensitive
to changes in income if $q_i^A$ and $q_i^B$ have the same income elasticity
(with respect to $X^A$ and $X^B$, respectively) and if the share of parents’
and children’s consumption is constant. It is hard to believe that these
harsh conditions are satisfied.

Once the assumption is discarded that the same good is consumed
in fixed proportions by parents and children (i.e., $q_i^A/q_i^B$ is constant),
the coefficients $m_i$ turn out to be the outcome of the decision both how
to allocate resources among parents and children and how to allocate
the household members’ budgets among the various goods. However,
once it is recognized that $m_i$ are endogenous, one has to abandon the
notion of “Barten prices” and the conclusion that demographic
changes are associated with relative price effects.23
One can estimate $m_i$ by estimating separately the demand of parents
and children ($g_i^A$ and $g_i^B$). But a prerequisite for the identification of
$g_i^B$ is the identification of the allocation rule $h$ (rather than vice versa).
The estimation of the goods-specific measures $m_i$ (i.e., the estimation
of $g_i^B$) therefore seems to be redundant as far as welfare comparisons
are concerned.


21 Barten himself was mostly interested in the question whether price elasticities can
be estimated from cross-section studies. He never specified the exact nature of the
household utility function.

22 To quote Deaton and Muellbauer (1986), “the most attractive is to regard u as a
measure of the parents’ [welfare] so that [$q_i/m_i$] is the consumption of good i that
actually reaches the parents when an amount $q_i$ is purchased for the family as a whole"
(pp. 735-56).

23 It is hard to believe that a family expansion is accompanied by a substitution from
cola to scotch because the latter has become relatively cheaper.
V. Adult Equivalence Scales and Welfare


Let us return from the intrafamily allocation rule to adult equivalence
scales. The traditional measure (eq. [2]) can now be given a simple
interpretation. The “cost” to a household with children (a household
with composition $k$) of attaining a welfare level $U^A$ from parents’
consumption is that consumption level, $X$, that will assure the parents
a budget of $X^A$. The cost to a childless family (a household with
composition $k_0$) of attaining that utility level is, naturally, only $X^A$.
Hence, the traditional scales are

$$m = \frac{C(U^A, p, k)}{C(U^A, p, k_0)} = \frac{X}{X^A}, \tag{20}$$

that is, the reciprocal of the parents' share in total consumption.

Can this measure be used for welfare comparisons? Alternatively, 
can it serve as a basis for a compensation scheme? My answer, by and 
large, follows that of Pollak and Wales (1979). Given the restricted
definition of welfare used in equation (20), any compensation scheme 
based on the traditional measure will overcompensate parents be¬ 
cause it ignores the utility they derive from their children (both in the 
short and in the long run). 

But our reservations go even further. The standard procedure 
does not relate to the question of the factors explaining interfamily 
differences in size. In a model in which family size (i.e., fertility) is 
subject to family decision, these differences may be explained by dif¬ 
ferences in preferences, income (total consumption), prices (includ¬ 
ing the price of time), or fixed costs. 

Figure 3 depicts the first case, that of two households facing the
same income ($X_0$), prices, and costs ($C$) but differing in “tastes” for
children. Whereas household A prefers the corner solution $X_0$,
spending all its resources on parents, household B is not deterred by
fixed costs and finds it optimal to have children, splitting its resources
between parents and children. Compensating parents for the fixed
costs of children will shift this household from $B$ to $E$, improving the
welfare of households with children, where there is no justification
for compensation. The household could have chosen $X_0$ but preferred
$B$ to $X_0$ to start with.

There is even less justification for compensation when the differ¬ 
ence in fertility patterns is explained by a difference in income, prices, 
or fixed costs. For example, figure 4 describes two families with the 
same income, prices, and preferences but with different fixed costs 
($C_A > C_B$). Given the lower costs, household B prefers to have children,
while household A spends all its income on parents’ consumption.
Naturally, B is better off than A. Compensating parents for the
fixed costs will only increase B’s advantage. 
The same general conclusion holds true when fertility is not subject
to the family’s discretion. Thus in figure 3, if a household with prefer¬ 
ences V K has unplanned (and unwanted) children, it will move from 
$X_0$ to an inferior point $B$. However, when this household is compensated
for the fixed cost, it moves to $E$, which is superior to $X_0$.

Finally, using a compensation scheme based on adult equivalence
scales assures the family of an income so that at the optimum it spends
$X_0$ on parents' consumption. This scheme moves parents on the expansion
curve $X^A = h(X)$ from $B$ to $F$. Needless to say, it highly
overcompensates households for having children. The compensation
is larger the steeper $h(X)$, that is, the more parents "invest” in (or
consume through) their children.

Not only are adult equivalence scales not very useful in serving the
aims of compensatory policy, they are also of little use in the comparative
statistical study of standards of living of families of different sizes.
Naturally, $X/X^A$ increases the more resources parents decide to divert
to their children. Hence, the more parents invest in their children, the
lower their “deflated” measure of living standards.
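
A short numerical sketch illustrates the point. It takes the traditional scale to be total consumption divided by the parents' budget, as in equation (20); the two families and their figures are hypothetical, and the function names are introduced here for illustration.

```python
def equivalence_scale(total, parents_budget):
    """Traditional scale: the reciprocal of the parents' share in total consumption."""
    if not 0 < parents_budget <= total:
        raise ValueError("parents' budget must be positive and at most total consumption")
    return total / parents_budget


def deflated_living_standard(total, parents_budget):
    """Total consumption deflated by the scale; this collapses to the parents' budget."""
    return total / equivalence_scale(total, parents_budget)


# Two hypothetical families with the same total consumption (300). The second
# diverts more resources to its children, so the parents keep less for themselves.
scale_low_investment = equivalence_scale(300.0, 220.0)
scale_high_investment = equivalence_scale(300.0, 180.0)
deflated_low_investment = deflated_living_standard(300.0, 220.0)
deflated_high_investment = deflated_living_standard(300.0, 180.0)
```

With identical total consumption, the family that invests more in its children has the larger scale and therefore the lower deflated standard of living, even though the allocation was its own choice.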


VI. Concluding Comments 

The long-lasting controversy surrounding adult equivalence scales 
sometimes leaves the impression that there is no way of choosing 
among the various estimation methods. Any method can be justified 
by making the appropriate assumptions about the underlying welfare 
function. It is hoped that this paper has dispelled this impression. 
Instead of inferring the welfare function from the estimation proce¬ 
dure (an exercise that results in pretty odd welfare statements, e.g., 
utility = food), it tries to generate the estimation procedure from the
basic principles of theory. Noting that a prerequisite for the compari¬ 
sons of parents’ welfare across households differing in size is the 
separability of the welfare that parents derive from their own con¬ 
sumption and the welfare they derive from their children’s, I have 
argued that the key to the estimation is the identification of parents’ 
consumption. In the absence of direct information on the share of 
parents in the overall family budget, one has to impute it from con¬ 
sumption patterns, given the restrictions of the underlying model. 
Resorting to separability, I have shown that a necessary condition for 
identification is the existence of some goods that are consumed exclu¬ 
sively by parents. 

Thus if one assumes that the welfare concept used as a yardstick in 


24 The only exception to this conclusion is the case in which A would not have chosen
to have children even in the absence of fixed costs (when is steeper than the slope of
the price line at $X_0$).
the derivation of adult equivalence scales is parents' welfare, the only
method of estimation that is consistent with the assumptions of the
model is, in my opinion, the Rothbarth method, which tries to derive
these scales by observing the effect of children on the consumption
patterns of "adult goods.” Any other method (e.g., the Engel or the
Barten methods) reflects factors that are goods-specific and that relate
to the intrafamily allocation decision, with no way of separating the
two.

The analysis’s strength is also its major source of weakness. It stands
or falls on the assumption of separability, an assumption that prima
facie looks incongruous in the family context. One way of evading this
problem is to adopt a vaguer notion of the welfare function, for
example, regarding it as the outcome of the family “consensus,”
which does not change with family composition. It is not clear, however,
that in this case one can derive the equivalence scales by observing
the intrafamily allocation of goods. Furthermore, even if the
values of m_i can be estimated from consumption data, one cannot
separate the various family members' needs from their power in
determining the family consensus. In the final outcome, both assumptions
on the nature of the utility functions raise serious problems. The
question of which reservations are more severe should be put to an
empirical test. 25

The analysis here, though successfully identifying the best method 
of estimation, also points out its shortcomings. Basing the estimation 
on observed consumption behavior shifts the emphasis from the derivation
of an objective “needs deflator” to the estimation of a behavioral
function describing the allocation of resources within the
household. Needs and other factors associated with consumption 
technology, such as fixed costs of children and returns to scale in 
consumption, affect this allocation but can rarely be isolated from the 
estimates of the allocation function. This seriously limits the use of 
these estimates for welfare comparisons. 

Even more harmful to the use of adult equivalence scales in welfare
comparisons is the restricted nature of the welfare definition used in
the analysis. The analysis focuses on the welfare parents derive from
their own consumption, ignoring the welfare they derive from having
children. Thus this “conditional” measure is almost useless in welfare
comparisons; using it as the base for a compensating scheme may
overcompensate households that are (even before compensation) better
off (e.g., face lower costs of children and lower prices or have
higher incomes). When the measure is used to compare living standards,
it assigns (other things being equal) a lower standard to a
family that decides to invest more in its children.

** The assumption of separability is tested in Gronau (1985) and passes the test
successfully. Similar tests are provided by Deaton, Ruiz-Castillo, and Thomas (1985).

To the best of my knowledge the implicit assumptions of the alternative approach, (a)
that the family consensus does not change and (b) that m_i are exogenous, have never been
tested.

Now that the traditional equivalence scales have been disposed of as 
indices of well-being, what indices should be used as guidelines for 
public policy? In our haste to “calculate numbers,” we have appar¬ 
ently neglected to question not only the nature of the identifying 
assumptions but also the goals of public policy. Kuznets (1982, p. 737) 
has already pointed out that 

available conversions for a shift from per person to per con¬ 
suming unit bases are all derived from the empirical data 
which reflect the adjustments to reduced income per person,
rather than the consumption needs of children viewed
as future members of the next adult generation of produc¬ 
ers. Our interest is in the reduced economic base for the 
children in terms of what this base, and the lower income of 
associated adults in the family, means for the capacity of the 
children when they reach adulthood, to contribute to social 
product. 

Furthermore, it may even be argued that children’s future produc¬ 
tive capacity should not be the concern of public policy unless social 
returns differ from private returns or social costs differ from private 
costs. Were this the case, one should subscribe to the traditional eco¬ 
nomic formula of equating private and social costs (and returns) 
through appropriate taxes and subsidies. In effect, the standard 
justification of government expenditures on health and education is 
bridging the gaps between private and social rates of return on invest¬ 
ment in human capital. Tax deductions that depend on family size are 
essentially a way of subsidizing larger families. It is, therefore, not at 
all clear that there is a need for the adult equivalence scales to guide
public policy (as distinct from information on private and social costs). 

Economists seem to have been carried away by the analogy between
adult equivalence scales and price indices. “A penny bun costs three
pence when you’ve a wife and a child” is the way Gorman (1976)
introduced adult equivalence scales. Had he phrased his complaint
differently, bemoaning the fact that he receives only one out of three
buns consumed by his family, much of the ensuing confusion could
have been avoided.
Risk, Futures Pricing, and the Organization of
Production in Commodity Markets 


David Hirshleifer 

University of California, Los Angeles 


This paper examines equilibrium in a spot and futures market with 
both primary producers (growers) and intermediate producers (pro¬ 
cessors). For a commodity that is subject to output shocks, processors 
tend to hedge long, in contrast with Hicks's theory of futures hedg¬ 
ing. Nevertheless, if transaction costs are low, the two-stage produc¬ 
tion process brings about a downward futures price bias, consistent 
with Hicks’s pricing prediction. But if costs of trading futures are 
high, growers tend to be differentially driven from the futures mar¬ 
ket, reversing the direction of the bias. Futures trading may also 
affect the organization of industry; when demand is inelastic, futures 
trading can serve as a substitute for vertical integration as a means of 
diversifying risk because the risk positions of growers are com¬ 
plementary with those of processors. 


I. Introduction 

The classical literature on futures hedging and price determination, 
from Keynes (1927) and Hicks (1939) through Telser (1958) and 
Cootner (1960), focused on the futures trading decisions of processors
or storage firms, to the exclusion of growers. The source of
risk was viewed as variability either in the cost of purchasing the raw 
commodity input or else in the value of the output produced. Fur¬ 
thermore, the production level was taken as fixed so that revenue (or 
input cost) was assumed to be perfectly correlated with the price of 
the output (or input). 


I would like to thank Eduardo Schwartz, Brett Trueman, Avanidhar Subrahmanyam,
Peter Carr, Warren Bailey, Dani Galai, and especially the referee for helpful
comments.

Several more recent writers have included growers in their analyses
(see McKinnon 1967; J. Hirshleifer 1977; Newbery and Stiglitz 1981; 
Britto 1984). This has permitted the analysis of futures hedging and 
equilibrium in which quantity risk and price risk jointly determine the 
overall revenue variability of producers. However, most of this litera¬ 
ture, while properly assigning a role for growers, does not consider 
intermediate processors of the commodity. 

The few exceptions, papers that cover the decisions of processors as 
well as growers, are all incomplete in important respects. O’Hara 
(1985) examines Hicks’s theory but does not explicitly set out the 
hedging decision problem of producers and is not specific about the 
nature of the contracts examined. 1 Anderson and Danthine (1983) 
assume that the raw commodity is costlessly transformed into the 
finished product. 2 Baesel and Grant (1982) consider processors under 
three scenarios each of which is special in crucially limiting respects. 3 
The current paper closes the model by making futures trading and 
spot purchases endogenous choices for outside speculators, for pri¬ 
mary suppliers (growers), and for intermediate handlers with increas¬ 
ing marginal costs of processing the commodity. 

The analysis here leads to a set of predictions that differs from 
those in both the classical literature and more recent papers on hedg¬ 
ing and futures prices. In contrast with Hicks’s theory of hedging, I 
show that processors tend to take long positions. However, as long as 
transaction costs are low, the presence of a second production stage 
promotes downward price bias, which is consistent with Hicks's pric¬ 
ing prediction. On the other hand, when costs of trading futures are 
high, growers tend to be differentially driven from the futures mar¬ 
ket, reversing the direction of futures price bias. 

The classical writers, who focused their analyses on intermediate
producers, were aware that most hedging on organized futures exchanges
was done by intermediate handlers of the commodity (processors
and storage firms), not growers (Paul, Heifner, and Helmuth
1976). However, one reason for low participation in futures markets
by growers is that they engage in forward trading (individualized
contracting between producers), which acts as a substitute for hedging
on the exchange. Growers, storers, and processors of grain are
connected by a web of extensive forward trading. 4 To understand
commodity market equilibrium, one must examine not just the futures
market in isolation, but what is really a single joint futures/forward
market. 5

1 O'Hara’s pricing results do not apply to futures contracts, which are based on
delivery of a fixed quantity of the commodity. Possibly the assertions of the paper refer
to crop share contracts.

2 They allow for possible spoilage of the commodity, but not for any other costly
inputs.

3 Under the first scenario, there is no futures market. In the second, processors are
assumed to have nonrandom profit margins, so that there is no risk to be hedged. In
the third, processors sell forward their product to unidentified third parties outside the

A second reason why many growers do not hedge on the organized 
exchange is that farms are small businesses, so that costs of learning 
how to use futures contracts may be a significant deterrent to trading 
futures. And of course, the direct transaction costs are a heavier 
burden on small rather than large traders, in both forward and fu¬ 
tures contracting. 

Rather than assuming either complete participation or complete 
absence of growers from the futures/forward market, this paper al¬ 
lows for the likelihood that for the reasons just stated some growers of 
at least some commodities may effectively remain unhedged. I dem¬ 
onstrate that the degree of participation in the futures market by 
different groups (speculators, growers, and processors) is an impor¬ 
tant determinant of how futures contracts are priced. 6 

A relatively unexplored topic is the extent to which financial risk-sharing
markets affect the industrial organization of product markets.
I show that the risks faced by processors versus growers may be
complementary, which leads to a risk-reducing benefit to forward
contracting or to vertical integration. This analysis casts light on
whether futures trading can effectively serve as a substitute for vertical
integration as a means of diversifying risk. The remainder of the
paper is structured as follows. The economic setting is laid out in
Section II. Market equilibrium is determined in Section III, which
contains all the paper’s results. Section IV concludes the paper.

4 Paul et al. (1976) and Helmuth (1977) describe how storers and processors who
contract forward with farmers typically hedge their commitments, by either futures
trading or a forward contract with a buyer at the next level.

5 However, futures and forward contracts are not perfect substitutes; among their
differences is the daily resettlement ("marking to market") feature of futures contracts.
A number of authors have shown that if daily interest rates are nonstochastic, then
futures and forward prices must be identical (Cox, Ingersoll, and Ross 1981; Jarrow
and Oldfield 1981). More generally, futures and forward prices should be very close
since their differences are due to shifts in the timing of cash flows over periods of only a
few months. Cornell and Reinganum (1981) and French (1983) found empirically that
the differences between futures and forward prices for metals and foreign exchange
were small and were not explained by models of the daily vs. terminal settlement
features.

6 Two significant criticisms of the traditional hedging pressure approach to futures
pricing that I will not address here are that (1) it excludes risky assets other than a
futures contract, ruling out the diversification effects underlying the capital asset pricing
model, and (2) with many outside risk bearers, the impact of hedging on prices
should be small (Telser 1958). In a hedging pressure/capital asset pricing hybrid model
(D. Hirshleifer 1988), I have argued that even in a large capital market, hedging by
undiversified producers can influence futures prices, an influence that is strengthened
by nonparticipation by outsiders in the futures market. In such a setting, the futures
price bias depends both on hedging pressure effects and on the “stock market risk” of
the futures contract (see also Stoll 1979).

II. The Economic Setting 

There are four types of individuals: (1) pure consumers (who are 
involved neither in supplying the good nor in trading futures), (2) 
outside speculators (who are not suppliers, but who trade in futures), 
(3) growers (primary suppliers of the raw commodity Y), and (4)
processors (intermediate suppliers of the final good Z). All markets 
are assumed to be competitive, and beliefs about distributions are all 
agreed and match the corresponding actual distributions. A tilde will 
be used to stress that a variable is random as viewed from date 0. 

The pure consumers enter only the terminal (date 1) spot market 
for the finished commodity Z. Since there is no need to model their 
optimizing decisions directly, they are represented only by the market
demand curve

$$\tilde{Q} = \delta (P_Z)^{\eta}, \tag{1}$$

where $\tilde{Q}$ is aggregate demand for the finished good, $P_Z$ is its spot price
at date 1, and $\eta$ is demand price elasticity.

The optimizing decision for the “traders” (a term reserved here for 
the three classes of individuals other than pure consumers) involves 
maximizing a mean-variance utility function 7 

$$U = E(\tilde{C}) - \frac{a}{2}\,\mathrm{var}(\tilde{C}), \tag{2}$$

where $\tilde{C}$ is terminal (date 1) consumption, and $a$ is absolute risk aversion,
the same for all traders.

Trading occurs at two dates in the model. The sequence of events is
as follows. The primary production (planting) decisions have been
made before the analysis proper begins, but the actual output of Y is a
stochastic variable whose realization will not be known until date 1. At
date 0 the futures market 8 opens, generating a futures price $P^0$ for
the raw commodity Y as a result of the trading decisions of the growers,
processors, and speculators. At the final date 1 the following
events occur: (i) the realization of the stochastic output of Y becomes
universally known; (ii) deliveries and financial settlements are made
on any outstanding futures contracts; (iii) a spot market for the raw
good Y opens, in which the intermediate processors purchase the
entire realized output of Y at an equilibrium date 1 price $P_Y$; (iv)
intermediate processing converts Y into finished good Z; 9 (v) the processed
output is sold to consumers at the equilibrium final price $P_Z$.

7 This applies under the assumptions of normality and constant absolute risk aversion
preferences. Similar results would apply without constant absolute risk aversion by
use of Stein's lemma (Stein 1973) or without normality assuming quadratic utility.

8 The term “futures” is to be interpreted as covering forward contracts or markets as
well. In a two-date setting, the distinction between daily resettlement (futures contracts)
and expiration date settlement (forward contracts) vanishes.

The superscripts G, A, and S will identify growers, processors, and
speculators, respectively. There are $n^G$, $n^A$, and $n^S$ of each type of
trader. For typical individuals in each category the initial wealth endowments,
in terms of a numeraire commodity (say dollars) apart
from whatever value attaches to their initial holdings or entitlements
of Y or Z, are $w^G$, $w^A$, and $w^S$; wealths could differ within each group
as well without affecting any results. In addition, the growers (and
only they) are endowed with a risky distribution $\tilde{q}^G$ of the raw commodity
Y. By convention, one unit of the raw good transforms to one
unit of finished good. The typical processor therefore purchases the
raw commodity and sells the finished commodity in necessarily equal
amounts $q^A$. Finally, let net revenue from spot purchase or sale of Y
or Z, as the case may be, be symbolized by $R^i$, where

$$\begin{aligned}
R^S &\equiv 0, \\
R^G &= P_Y \tilde{q}^G, \\
R^A &= (P_Z - P_Y)\,q^A - f(q^A),
\end{aligned} \tag{3}$$

and where $f(q^A)$ is the processing cost function, with $f', f'' > 0$.

For each group, let $\xi^i$ ($i = G, A, S$) represent the size of the date 0
futures position, and let $t^i$ be the fixed cost of trading. 10 Then the
consumption constraints for the optimizing decisions of a grower,
processor, or speculator all take the form 11

$$\tilde{C} = \begin{cases}
w - t + R + (P_Y - P^0)\,\xi & \text{if trade futures}, \\
w + R & \text{otherwise}.
\end{cases} \tag{4}$$

9 In a more realistic model, processing would take time, which would require a third
date after the purchase of the raw good. The current simplified model captures some
essential points that would be unaffected by the introduction of storage in the intermediate
process as long as the process does not carry over into a second stochastic
harvest. There has been considerable modeling of carryover across harvests (Newbery
and Stiglitz 1982; Scheinkman and Schechtman 1983; Turnovsky 1983), a topic not
addressed here. Because of serial interactions of risk between harvests, the problem of
carryover calls for a multiperiod consumption framework.

10 The fixed cost represents such one-time setup costs as learning about contracting
procedures and establishing trading contacts, which are likely to be more important as
deterrents to trading than explicit brokerage fees. A different interpretation is that the
fixed cost represents the minimum investment in learning needed to avoid trading at
an informational disadvantage in the futures market.

11 The consumption constraint rules out sale of equity shares in the producer's business;
in fact most U.S. farms and agricultural firms are closely held. The imperfect
marketability of revenue risk leaves producers with an incentive to hedge using futures.
Scale economies in going public, as well as moral hazard and adverse selection problems,
may explain the limited equity issuance by small producers. Even in widely held
firms, optimal contracts that impose risks on managers may provide an incentive to
hedge the firm's risk using futures (Diamond and Verrecchia 1982).

Growers are assumed to sell all their output at the raw good price
$P_Y$ and the processors all theirs at the finished good price $P_Z$, with
settlement of all futures contracts taking place by financial balancing
rather than actual delivery and acceptance of the good. It will also be 
assumed that all the traders in aggregate are only a negligible factor 
on the demand side of the final-product market. Thus the aggregate 
demand for the finished good Z is assumed to be unaffected by the 
various parties’ gains and losses in futures trading, as well as the 
realization of the output of the raw good Y. 

III. Industry Structure, Market Participation, and 
Futures Pricing 

A. Spot Market Equilibrium 

Let us begin with equilibrium in the spot market for the raw and
finished commodity at date 1 and then move backward to the futures
hedging decisions and equilibrium at date 0. After output uncertainty
has been resolved, consumption variance is zero, and the processor’s
production decision is to select $q^A(P_Y, P_Z)$ to maximize net revenue.
The processing cost function in (3) is assumed to be quadratic,
$f(q^A) = \gamma_0 q^A + (\gamma_1/2)(q^A)^2$, where $\gamma_0, \gamma_1 > 0$. 12

The optimality condition price = marginal cost yields an optimal
output of

$$q^A = \frac{P_Z - P_Y - \gamma_0}{\gamma_1}. \tag{5}$$

I assume that all the raw output is processed, so that aggregate
demand by processors for the input is equal to the total quantity
supplied by growers, $n^A q^A = Q$. So in equilibrium the spread between
final and raw product price by (5) is

$$P_Z - P_Y = \gamma_0 + \gamma_1 \frac{Q}{n^A}. \tag{6}$$

Note that higher Q raises the spread between the raw and finished 
product price. Intuitively, a greater demand for processing services 
increases the wedge between input and output prices and, as will be 
shown below, the rents to processors. 


12 The quadratic cost function is especially tractable because it leads to linear
first-order conditions, but similar results would hold more generally.
By (1), the equilibrium spot price for the finished good is

$$P_Z = \left(\frac{Q}{\delta}\right)^{1/\eta}, \tag{7}$$

which with (6) determines the raw product price

$$P_Y = \left(\frac{Q}{\delta}\right)^{1/\eta} - \gamma_0 - \gamma_1 \frac{Q}{n^A}. \tag{8}$$

Having described the spot market equilibrium at date 1, I will proceed
backward to determine futures hedging choices at date 0 in the
next subsection. Then, in Section IIIC, I will examine how futures
prices are determined at date 0.
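
The two comparative statics used repeatedly in what follows (the processing spread widens and the raw price falls as output rises) can be checked directly. This is only a sketch: it takes demand to be of the constant-elasticity form $Q = \delta (P_Z)^{\eta}$ and adds the sign restrictions $\delta > 0$ and $\eta < 0$ as assumptions, together with the quadratic processing cost with $\gamma_1 > 0$.

```latex
\begin{align*}
% Spread: differentiate P_Z - P_Y = \gamma_0 + \gamma_1 Q / n^A with respect to Q
\frac{d(P_Z - P_Y)}{dQ} &= \frac{\gamma_1}{n^A} > 0, \\[4pt]
% Finished price: invert the demand curve, then differentiate
P_Z = \left(\frac{Q}{\delta}\right)^{1/\eta}
\;&\Longrightarrow\;
\frac{dP_Z}{dQ} = \frac{1}{\eta}\,\frac{P_Z}{Q} < 0 \quad (\eta < 0), \\[4pt]
% Raw price: finished price minus the spread
\frac{dP_Y}{dQ} &= \frac{1}{\eta}\,\frac{P_Z}{Q} - \frac{\gamma_1}{n^A} < 0 .
\end{align*}
```

Both terms in the last line are negative, so the raw price falls with output for any downward-sloping demand curve of this form, not only in the inelastic case.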

B. The Futures Hedging Decision 

Let us now turn to the date 0 futures hedging problem for growers,
speculators, and processors. The futures position is found by maximizing
expected utility in (2) with respect to $\xi$ subject to the upper
equality in (4) for a grower, processor, or speculator. Expected utility
becomes

$$U = w - t + E(R) + \xi E(P_Y - P^0) - \frac{a}{2}\left[\mathrm{var}(R) + \xi^2\,\mathrm{var}(P_Y) + 2\xi\,\mathrm{cov}(R, P_Y)\right]. \tag{9}$$

Let the futures price bias be defined as $B \equiv P^0 - E(P_Y)$. A downward-biased
futures price ($B < 0$) is often called “backwardation” and
an upward bias “contango.” Differentiation yields an optimal futures
position for any of these types of traders if he takes a futures position
at all, which takes the form

$$\xi = -\frac{\mathrm{cov}(R, P_Y - P^0) + (B/a)}{\mathrm{var}(P_Y)}. \tag{10}$$

The decision whether or not to trade futures is made by comparing 
the expected utility that arises from either alternative (see [14] below). 
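
The differentiation step can be written out as follows. The sketch assumes the mean-variance objective takes the form $E(C) - (a/2)\,\mathrm{var}(C)$ with $a > 0$, and that consumption for a trader who takes a position $\xi$ is $w - t + R + (P_Y - P^0)\xi$; these are my reading of (2) and (4), and the intermediate lines are not taken from the original.

```latex
\begin{align*}
% First-order condition of U with respect to the futures position \xi
\frac{\partial U}{\partial \xi}
  &= E(P_Y) - P^0 - a\,\xi\,\mathrm{var}(P_Y) - a\,\mathrm{cov}(R, P_Y) = 0, \\[4pt]
% Substitute B = P^0 - E(P_Y); P^0 is known at date 0, so
% cov(R, P_Y - P^0) = cov(R, P_Y)
\xi &= -\,\frac{\mathrm{cov}(R, P_Y - P^0) + B/a}{\mathrm{var}(P_Y)}, \\[4pt]
% Second-order condition confirms a maximum
\frac{\partial^2 U}{\partial \xi^2} &= -a\,\mathrm{var}(P_Y) < 0 .
\end{align*}
```

With $R \equiv 0$ the covariance drops out and the position reduces to $-B/[a\,\mathrm{var}(P_Y)]$, which is long when $B < 0$ and short when $B > 0$.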

For a speculator the covariance term vanishes, so that $\xi^S$ is positive
or negative according to the sign of the futures price bias B. In the
absence of covariation between profits and the futures payoff, the 
futures contract cannot be used to hedge, so the optimal position will 
be long or short depending on whether a long position in the com¬ 
modity generates an expected profit or loss. 

Rolfo (1980) and Newbery and Stiglitz (1981) have pointed out that
in mean-variance hedging problems, the optimal futures position
contains two components, one for risk reduction and one to exploit
the expected profit that can be achieved when bias is nonzero (the first
and second terms in the numerator of [10]). If B were zero (unbiased
futures prices), there would be no expected profit to either a long or
short position, so only the risk reduction component would remain.
In this case, the direction in which processors hedge is given by the
sign of the covariance term, which is found by seeing how $R^A$ and $P_Y$
covary across aggregate output states. By (5) and (6)

$$\frac{dR^A}{dQ} = \frac{\gamma_1 Q}{(n^A)^2} > 0. \tag{11}$$

Thus net revenue increases with aggregate output, so that processors
do best when spot prices are low. By (8), $P_Y$ declines with output.
Therefore, $R^A$ and $P_Y - P^0$ are inversely ordered in the sense of
Hardy, Littlewood, and Pólya (1952) (as one goes up, the other goes
down), so $\mathrm{cov}(R^A, P_Y - P^0) < 0$. It follows by (10) that as long as bias is
nonpositive, processors hedge long, in contrast with Hicks’s theory
and O’Hara’s further development of it.

Proposition 1. When output is stochastic and there is a positive 
marginal cost of processing the commodity for resale, if bias is not 
upward ($B \le 0$), processors take long futures positions.

Processors tend to hedge long because their net revenues (and also
their revenues gross of processing costs, $[P_Z - P_Y]q^A$) are highest
when output is high and price low. The specialized resources of processors
are most valuable when the crop is plentiful because this is
when the demand for their services is highest. So their profits are
inversely related to the futures payoff, and a long hedge reduces
risk. 13
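
To see why both net and gross processor revenue rise with the crop, one can substitute the equilibrium spread into the revenue definition. The sketch assumes the quadratic cost $f(q^A) = \gamma_0 q^A + (\gamma_1/2)(q^A)^2$, the equilibrium spread $P_Z - P_Y = \gamma_0 + \gamma_1 q^A$, and full processing $q^A = Q/n^A$; the intermediate algebra is mine.

```latex
\begin{align*}
% Net revenue: spread times quantity, less processing cost
R^A &= (\gamma_0 + \gamma_1 q^A)\,q^A - \gamma_0 q^A - \tfrac{\gamma_1}{2}(q^A)^2
     = \frac{\gamma_1}{2}\left(\frac{Q}{n^A}\right)^{2}, \\[4pt]
\frac{dR^A}{dQ} &= \frac{\gamma_1 Q}{(n^A)^2} > 0, \\[4pt]
% Gross revenue: spread times quantity only
(P_Z - P_Y)\,q^A &= \gamma_0 \frac{Q}{n^A} + \gamma_1 \left(\frac{Q}{n^A}\right)^{2},
\qquad
\frac{d\,[(P_Z - P_Y)\,q^A]}{dQ} = \frac{\gamma_0}{n^A} + \frac{2\gamma_1 Q}{(n^A)^2} > 0 .
\end{align*}
```

Net revenue is a pure rent on rising marginal cost: it vanishes if $\gamma_1 = 0$, in which case the processor has no revenue risk to hedge.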

If demand elasticity were unitary, then by (1) gross revenue received
by producers as a group, $P_Z Q = \delta$, would be a constant. Or if
demand were inelastic ($\eta > -1$), by (7), revenue $P_Z Q = \delta^{-(1/\eta)} Q^{1+(1/\eta)}$
would decrease with Q. (Intuitively, when demand is inelastic, price
fluctuations are large compared to output fluctuations in percentage
terms, so that total revenues are high when output is low.) But by (6),
multiplying on the left and right by $n^A q^A = Q$, we see that gross
revenues to processors are increasing in Q. It follows in either case that
gross revenue to growers is lowest when output is high, a complementary
risk position. Recalling by (8) that $P_Y$ decreases with Q, we see that
the covariance in (10) is positive, so that without a bias, growers hedge
short.
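
The step from total revenue to growers' revenue can be made explicit. The sketch below treats growers in aggregate, so that their revenue is $P_Y Q$, and assumes $\delta > 0$ and $-1 \le \eta < 0$ (inelastic or unitary elastic demand); the decomposition is an illustration rather than the original derivation.

```latex
\begin{align*}
% Growers receive total consumer spending less the processing spread
P_Y Q &= P_Z Q - (P_Z - P_Y)\,Q
       = \delta^{-1/\eta}\,Q^{\,1 + 1/\eta} - \gamma_0 Q - \gamma_1 \frac{Q^2}{n^A}, \\[4pt]
\frac{d(P_Y Q)}{dQ} &= \left(1 + \frac{1}{\eta}\right) P_Z - \gamma_0 - \frac{2\gamma_1 Q}{n^A} .
\end{align*}
```

For $-1 \le \eta < 0$ the factor $1 + 1/\eta$ is nonpositive, so every term is nonpositive and the last two are strictly negative: growers' revenue falls with output even at unitary elasticity, where it would be constant in the absence of a processing stage.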


13 The hedging demand for futures by processors is akin to the “convenience yield”
on storage described by Newbery and Stiglitz (1981, p. 196). They observe that a
producer whose revenue covaries inversely with the spot price has an incentive to
reduce risk by storing, i.e., a long spot position.
Corollary. If demand is inelastic or unitary elastic, the risk reduction
component of each grower’s optimal hedge is negative.

In models without processors, the risk reduction component of a 
typical grower’s hedge is positive or negative according to demand 
elasticity and is zero for unitary elastic demand. Here, even with 
unitary elasticity, growers hedge short, so the presence of a second 
production stage promotes short hedging by growers. Basically, pro¬ 
cessors absorb a profit wedge that covaries negatively with the spot 
price, which would have accrued to growers if the good sprang from
the earth in finished form. 

A relatively unexplored topic is the relation of futures trading to 
industry structure (see, however, Carlton 1983). We have just seen 
that under inelastic or unitary elastic demand, growers and pro¬ 
cessors have complementary risk positions. In the absence of a futures 
market, there is an incentive to vertically integrate the negatively 
correlated payoffs of a grower and a processor. This could be done by 
combining the farm and processing assets under a single ownership 
or by forward contracting between the grower and the processor (for 
either fixed quantities or variable crop shares). 14 Opening a futures 
market introduces an alternative means by which these groups can 
transfer risk. It thereby encourages a greater degree of productive 
specialization. Thus organized futures trading can act as a substitute 
for vertical integration or share contracting as a means of diversifying 
risk. 15 

With sufficiently elastic demand, on the other hand, $P_Z Q$ will increase
with output, so that growers also will do best when output is
high. In this case risk positions are no longer complementary; futures 
trading, instead of substituting for vertical integration, promotes it. 
The ability of integrated producers to trade futures then mitigates the 
adverse effects of combining positively correlated risks. 


C. Futures Market Equilibrium 

Let us next examine equilibrium futures price bias as a predictor of
the later spot price (B). The clearest baseline case is one in which
demand elasticity is unitary. In a mean-variance model in which the
commodity can be immediately transferred from primary producer
to consumer without processing, bias is upward or downward according
to demand elasticity, so that unitary elasticity leads to zero bias. 16
The intuition is very simple. With unitary demand elasticity, price is
inversely proportional to aggregate output, so for a typical grower
revenue is nonrandom. This eliminates any hedging pressure effect
on the futures price. Let us turn to a commodity that must be processed
for final consumption. Superficially it might appear that the
bias should now be upward since, as shown in the preceding subsection,
processors have an incentive to hedge long.

14 Historically, farming cooperatives for wheat have owned a significant share of the
U.S. wheat elevator business. Paul et al. (1976) describe the agreements of vegetable
shippers and canners with farmers deducting the packing and processing costs from
the proceeds of sales and then giving the residual return to the grower.

15 It is interesting that the rise of the oil futures markets occurred in the 1970s, when
the rise of OPEC segregated the production stages of oil extraction from refining in
what had been a highly vertically integrated industry.

We now find the equilibrium price bias, taking as given the number
of traders of each type in the futures market. It is assumed initially
that $t^A$ and $t^G$ are zero so that all producers trade futures. But
speculators may be deterred from participating fully by transaction
cost $t^S > 0$ so that $\lambda^S \le n^S$ actually trade. 17 Later I allow for the
possibility that producers are also deterred by the trading cost. The
bias in the futures price may be found by employing the market-clearing
condition that the individual futures positions sum to zero:

$$0 = n^A \xi^A + n^G \xi^G + \lambda^S \xi^S. \tag{12}$$

Substituting the optimal futures positions $\xi^i$ for the different traders
from (10) using (3) and noting that, with none of the commodity
wasted, $n^A q^A = n^G q^G$ gives 18

$$B = -\frac{a}{n^A + n^G + \lambda^S}\,\mathrm{cov}\!\left[P_Z Q - n^A f\!\left(\frac{Q}{n^A}\right),\; P_Y - P^0\right]. \tag{13}$$

The direction of the bias is determined by the sign of the covariance.
In the case of unitary demand elasticity for the finished good, the
product $P_Z Q$ is nonstochastic. Since $f(Q/n^A)$ is increasing in Q and $P_Y$
is decreasing, by similar ordering the covariance is positive. It follows
that $B < 0$ (downward bias) and, by proposition 1, processors hedge
long. In other words, two-stage production leads to backwardation, not
contango!
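
The aggregation behind the bias formula can be sketched as follows. I write $\lambda^S$ for the number of speculators who actually trade (the symbol is my notational choice), take each trader's position to be $\xi^i = -[\mathrm{cov}(R^i, P_Y) + B/a]/\mathrm{var}(P_Y)$ with $R^S \equiv 0$, and use $n^A q^A = n^G q^G = Q$.

```latex
\begin{align*}
% Market clearing: positions of processors, growers, and active speculators sum to zero
0 &= n^A \xi^A + n^G \xi^G + \lambda^S \xi^S \\
  &= -\frac{1}{\mathrm{var}(P_Y)}
     \left[\mathrm{cov}\!\left(n^A R^A + n^G R^G,\; P_Y\right)
           + \left(n^A + n^G + \lambda^S\right)\frac{B}{a}\right], \\[4pt]
% Pooled producer revenue: the raw-good payments cancel between the two stages
n^A R^A + n^G R^G
  &= (P_Z - P_Y)\,Q - n^A f\!\left(\frac{Q}{n^A}\right) + P_Y Q
   = P_Z Q - n^A f\!\left(\frac{Q}{n^A}\right), \\[4pt]
B &= -\frac{a}{n^A + n^G + \lambda^S}\,
     \mathrm{cov}\!\left[P_Z Q - n^A f\!\left(\frac{Q}{n^A}\right),\; P_Y\right].
\end{align*}
```

Because $P^0$ is known at date 0, replacing $P_Y$ by $P_Y - P^0$ inside the covariance changes nothing. The pooled revenue in the covariance is exactly the profit of a hypothetical vertically integrated industry, which is why the bias depends on consumer spending net of processing cost rather than on either stage separately.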

If demand is inelastic, the futures price is still downward biased
because if $\eta > -1$, by (7) $P_Z Q$ decreases with Q, so that the two
arguments of the covariance are still similarly ordered. Note also that
since $B < 0$, speculators are long by (10), so by futures market clearing
(12), growers are short. This shows the following proposition.

16 Single-stage models relating bias to demand elasticity include Britto (1984) and D.
Hirshleifer (1988).

17 This asymmetric assumption is meant to reflect the fact that while futures or
forward trading is common among producers, only a small minority of outside investors
trade commodity futures, either directly or through financial intermediaries such
as futures mutual funds.

18 If different traders have different risk aversions (a), then the same equation obtains
with a replaced by the harmonic mean of the a's of the traders on the futures
market.

Proposition 2. When demand for the finished good is inelastic or 
unitary elastic and there is a positive marginal cost of processing the 
commodity for resale, then processors hedge long, growers hedge 
short, and the futures price is downward biased. 

It is remarkable that even though processors are long in futures, 
the reverse of Hicks’s predicted hedge, the necessity of processing the 
commodity brings about a downward bias, confirming his conclusion 
about price. This seeming anomaly arises because the activities of 
processors affect the risks and hedging choices of growers. The sec¬ 
ond production stage leads growers to go short by an amount that 
more than offsets the long positions of processors. 

This may be seen more clearly by considering the profits of a verti¬ 
cally integrated producer who both grows and processes the commod¬ 
ity. The gross revenue he receives from consumers with unitary de¬ 
mand elasticity is nonrandom because price and quantity move in 
exact inverse proportion. But the total processing cost incurred rises 
with output, so net profit decreases with output. So a vertically inte¬ 
grated producer’s profit covaries positively with the spot price, which 
implies that a short hedge is risk reducing. To induce outsiders to 
take on risky long positions, a downward bias (backwardation) is 
called for. 
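
The integrated-producer argument can be checked with a small simulation. This is only a sketch under assumptions that are not in the paper: a constant marginal processing cost (`unit_cost`), a fixed consumer outlay (`revenue`), uniformly distributed harvests, and the finished-good price used as a stand-in for the spot price (both prices fall as output rises).

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def spot_price(q, revenue=10.0):
    """Price under unitary demand elasticity: price * quantity = revenue."""
    return revenue / q


def integrated_profit(q, revenue=10.0, unit_cost=2.0):
    """Nonrandom gross revenue minus a processing cost that rises with output.

    `revenue` and `unit_cost` are hypothetical illustration values.
    """
    return revenue - unit_cost * q


rng = random.Random(0)
outputs = [rng.uniform(0.5, 1.5) for _ in range(1000)]  # random harvests
profits = [integrated_profit(q) for q in outputs]
prices = [spot_price(q) for q in outputs]

# Profit and price both fall when output is high, so they covary positively.
cov_profit_price = covariance(profits, prices)
print("cov(profit, price) > 0:", cov_profit_price > 0)
```

Because profit is high exactly when the price is high, a short futures position, which loses when the price rises, offsets the producer's risk.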

Combining propositions 1 and 2, we have shown that processors 
hedge long because their net revenues covary negatively with the spot 
price. Yet as argued above, a vertically integrated producer’s reve¬ 
nues would covary positively with the spot price. It follows that grow¬ 
ers’ revenues covary positively with the spot price. (The covariance 
would be zero if the commodity could be consumed without a second 
stage of production.) So for growers short hedging is optimal. The 
growers’ incentive to hedge short outweighs the impact of the long- 
hedging activity of processors, and hence, the futures price is down¬ 
ward biased. 

This contrasts with Anderson and Danthine's (1983) prediction, for 
which the second production stage was essentially irrelevant. They 
found upward or downward bias according to demand elasticity, 19 a 
result that, as mentioned at the start of this subsection, obtains equally 
in a framework with a single production stage. The discrepancy between
their result and that of this paper arises from their assumption
that the marginal dollar cost of processing the commodity is zero,
whereas here it is positive. The effect of demand elasticity is still
reflected here in the revenue (P Z Q) term of (13), but here there is also
a subtracted term (f) reflecting the cost of production.

19 This refers to their “input flexibility case." In their inflexible input case, futures
positions are taken after the production level of processors has been committed and
output is nonrandom. They find that bias is signed according to the expected value of
an exogenous shock to the demand for the raw commodity by intermediate firms.
These trades are determined outside the model, so their analysis does not indicate in
which direction the bias would ordinarily go.

Nonparticipation by Growers 

Let us now consider how $\hat{n}_S$, $\hat{n}_A$, and $\hat{n}_G$ are determined in equilibrium
to see if differences in participation by different kinds of producers
are likely to affect the pricing of futures contracts. A speculator's or
producer’s decision whether to trade in the futures market is based on
whether his expected utility is higher from trading or refraining. Let
V(w; refrain) be the expected utility of an individual with wealth w
who behaves optimally in his production decisions (if he is a grower or
processor) and does not trade in the futures market. Let V(w - t;
participate) similarly denote the expected utility attained by trading
optimally in the futures market. If all speculators face the same transaction
cost $t_S$, then the number of speculators $\hat{n}_S$ is determined by an
indifference condition that

$$V^S(w - t_S;\ \text{participate}) = V^S(w;\ \text{refrain}), \tag{14}$$

where the S superscripts denote speculators. 20 When $t_G, t_A > 0$, similar
indifference conditions determine $\hat{n}_A$ and $\hat{n}_G$.

Suppose for simplicity now that the transaction cost is the same for 
all three groups, $t_A = t_G = t_S > 0$. Without detailed formal analysis,
we can make an intuitively reasonable statement about which types of 
producers will tend to participate more or less. A hedger who by 
trading futures attains little risk reduction is less willing to pay a given 
fixed cost of trading. So (with the correlation of the hedger’s revenue 
with the futures payoff held constant) a producer with a large reve¬ 
nue variance to be hedged is less likely to be deterred by a given fixed 
transaction cost than one with only a small risk. 

Many processors are large-scale enterprises, unlike most growers. 
(Some economic reasons are mentioned by Newbery and Stiglitz 
[1981, p. 197].) For example, 52 percent of U.S. milling capacity is 
accounted for by the top four flour-milling firms (Goldberg 1983).
This suggests that just as more speculators than growers are driven
from the futures market by transaction costs, 21 so more growers than

20 If individuals have different transaction costs, then this condition will obtain only
for the marginal speculator.

21 In a perfect market, very many small speculators would hold a fraction of the
futures contract to profit off of any bias, however small. A small transaction cost will 
suffice to deter most of them. 
processors will be deterred from trading futures. Suppose that only
$\hat{n}_G < n_G$ growers and $\hat{n}_A < n_A$ of the processors trade futures. If $\hat{n}_G$ is
near zero and $\hat{n}_A$ is near $n_A$, then while many short-hedging growers
are driven out, the long-hedging processors for the most part remain 
in the market. 

The next proposition follows immediately. 

Proposition 3. With unitary elastic demand and a sufficiently 
large transaction cost that deters growers rather than processors from 
the futures market, the futures price is upward biased. 

Proof. Let $g \equiv \hat{n}_G/(\hat{n}_G + \hat{n}_A + \hat{n}_S)$ and $\alpha \equiv (\hat{n}_A/\hat{n}_G)g$. Then following
steps analogous to those leading to (13) gives

$$B = -a[\alpha\,\mathrm{cov}(R_A, P_Y - P^0) + g\,\mathrm{cov}(R_G, P_Y - P^0)]. \tag{15}$$

We have already seen by (11) that the first covariance is negative. By (8),

$$\frac{dR_G}{dQ} = \frac{1}{n_G}\frac{d(P_Y Q)}{dQ} = \frac{1}{n_G}\left(-\gamma_0 - \frac{2\gamma_1 Q}{n_A}\right) < 0, \tag{16}$$


so the second covariance is positive (indicating the tendency for growers
to hedge short). If transaction costs lead $\hat{n}_G$ to be sufficiently small
relative to $\hat{n}_A$, then the right-hand side of (15) will be positive, implying
a positive instead of a negative bias. Q.E.D.

More generally, with inelastic demand a lack of participation by 
growers will tend to algebraically increase the bias, although it need 
not be upward. The intuition of proposition 3 is that the transaction 
cost affects the relative importance of hedging by growers versus 
processors, who have negatively correlated risks. We saw above that 
processors have a long hedging incentive (promoting upward bias) 
whereas growers have a short hedging incentive (promoting downward
bias). So a sufficiently large transaction cost will lead to an
upward-biased futures price.
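
The sign reversal in proposition 3 can be illustrated numerically. The sketch below takes the weights in (15) to be the shares of participating processors and growers among all futures traders, which is an assumed reading of the notation; the covariance values and trader counts are made-up numbers chosen only to show the mechanism.

```python
def futures_bias(risk_aversion, growers, processors, speculators,
                 cov_processor, cov_grower):
    """Bias B = -a * (alpha * cov_A + g * cov_G), following the form of (15).

    `growers`, `processors`, and `speculators` are the numbers of each group
    that actually trade futures. All inputs are hypothetical.
    """
    total = growers + processors + speculators
    g = growers / total          # weight on the growers' covariance
    alpha = processors / total   # equals (processors / growers) * g
    return -risk_aversion * (alpha * cov_processor + g * cov_grower)


# Processors' revenue covaries negatively with the spot price, growers'
# revenue positively (signs from the text; magnitudes are assumptions).
COV_PROCESSOR = -1.0
COV_GROWER = 3.0

bias_full = futures_bias(1.0, growers=100, processors=10, speculators=10,
                         cov_processor=COV_PROCESSOR, cov_grower=COV_GROWER)
bias_few_growers = futures_bias(1.0, growers=2, processors=10, speculators=10,
                                cov_processor=COV_PROCESSOR,
                                cov_grower=COV_GROWER)

print("many growers trade:", bias_full)
print("few growers trade: ", bias_few_growers)
```

With most growers in the market the short-hedging term dominates and the bias is negative; once nearly all growers drop out, the processors' long hedging dominates and the bias turns positive.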

The assumption that transaction costs drive growers from the fu¬ 
tures market is in the spirit of the classical theorists. Yet in this case, 
the result here of upward bias (contango) is the opposite of the down¬ 
ward bias (normal backwardation) their discussions predicted. This 
highlights the importance of modeling two-sided uncertainty with 
both price and quantity risk to fully describe equilibrium price deter¬ 
mination. 

It is worth stressing that the current analysis has focused on output 
shocks while assuming demand to be constant. For most agricultural 
commodities, output shocks are probably the dominant source of 
price variability. But when demand rather than supply is subject to 
shocks, hedging pressure will typically promote downward bias because
high demand will be good news for both growers and processors.
This leads to a short hedging incentive, tending to reduce the
futures price. So a possible interpretation of Hicks’s backwardation 
theory is that it was intended for markets with stable output and 
stochastic demand. More generally, the tendency toward upward bias 
suggested here will tend to be muted if demand as well as output is 
stochastic. 


IV. Conclusion 

This paper has examined how futures contracts are priced for com¬ 
modities whose production is associated with quantity risk and that 
must be processed for final resale. The model allows for uncertainty 
in both the cost of purchasing the raw commodity and the price at 
which it can be sold. Furthermore, the ability of processors to vary 
their level of production in response to the prices they face affects 
their overall risk. The paper also includes costs of processing the 
commodity, which were shown to be crucial if the inclusion of a sec¬ 
ond stage of production is to make a substantive difference in the 
model. 

In the absence of transaction costs, intermediate producers hedge 
long, in contrast with Keynes’s and Hicks’s theories, but the bias in the 
futures price as a predictor of the later spot price is downward, 
confirming their pricing prediction. A lack of participation in the 
futures market by growers can reverse the traditional prediction, 
leading to upward bias, or contango. 

When demand is inelastic or only mildly elastic, processors and 
growers have complementary risk positions. This suggests that for 
closely held producers, there is a risk diversification benefit to vertical 
integration or to forward contracting. However, futures trading can 
act as a substitute for vertical integration as a means of reducing risk. 
Thus not only does the nature of the production process affect fu¬ 
tures pricing, but the presence or absence of a futures market can 
affect how production is optimally organized. 
The Efficiency of Investment in the Presence
of Aggregate Demand Spillovers 


Andrei Shleifer and Robert W. Vishny 

University of Chicago 


In the presence of aggregate demand spillovers, an imperfectly com¬ 
petitive firm’s profit is positively related to aggregate income, which 
in turn rises with profits of all firms in the economy. This pecuniary 
externality makes a dollar of a firm’s profit raise aggregate income 
by more than a dollar since other firms' profits also rise, and in this 
way gives rise to a “multiplier.” Since such multipliers are ignored by 
firms making investment decisions, privately optimal investment de¬ 
cisions under uncertainty will not in general be socially optimal. 
Under reasonable conditions, investment is too low. 


I. Introduction 

This paper analyzes investment decisions in the presence of macroeconomic
externalities. Following the work of Blanchard and Kiyotaki
(1987) and Cooper and John (1985), 1 we study a model with
aggregate demand spillovers, in which a firm’s profit is positively 
related to aggregate income, which in turn rises with profits of all 
firms in the economy. With this externality, a dollar of a firm’s profit 
raises aggregate income by more than a dollar since other firms’ 
profits also rise, and similarly a dollar of a firm’s loss reduces income 
by more than a dollar. Equivalently, there is a "multiplier” on a firm’s 
profit (or loss) in the determination of aggregate income. Moreover, 
such multipliers vary across states of nature, depending on how many 
other firms benefit from a firm’s profit (or lose from its loss) in each
state. Because firms ignore this variation of multipliers across states in 
making investment decisions, profit-maximizing choices need not be 
socially optimal. 

To set up a benchmark for evaluating economies with imperfectly 
informed firms, Section II presents a full-information economy. In 
our highly stylized model, each sector has a potential monopolist with 
access to a cost reduction technology. Each monopolist must decide 
whether to invest and obtain a low marginal cost or leave the market 
to a competitive fringe that has a higher marginal cost. The profit- 
maximizing choice depends on expected demand since only in a large 
enough market can an investment in unit cost reduction break even. 
Demand, in turn, depends on profits of other sectors since profits are 
distributed to the consumer and spent by him. Aggregate demand 
spillovers through the distribution of profits make firms interested in 
the productive potentials of firms in other sectors of the economy. 

In Section II, the realized distribution of cost reduction tech¬ 
nologies across sectors is publicly known. This knowledge enables 
each potential monopolist to compute the profits of potential monop¬ 
olists in other sectors and in this way to forecast aggregate profits and 
demand. He can then gauge the size of his own market and make an 
accurate investment decision. In the benchmark case of perfect infor¬ 
mation, the economy has a unique perfect-foresight equilibrium in 
which investment decisions are efficient. In other words, a perfectly 
informed planner would have each firm make the same investment 
decision as it does in the free-market equilibrium. 

In contrast, Section III analyzes the same economy, except now 
firms have imperfect knowledge about cost reduction opportunities 
of other sectors. Firms then have to make forecasts of aggregate de¬ 
mand based on their priors as well as observation of their own techno¬ 
logical opportunities. In this case, rational expectations equilibria ex¬ 
ist but are not, in general, unique or efficient. There are two sources 
of inefficiency. The first is the inability of firms to accurately condi¬ 
tion their investment choices on circumstances of other sectors since 
decisions must be made on the basis of imperfect information. An 
equally well-informed social planner would face the same difficulty. 

The second source of inefficiency stems from the divergence of 
profit-maximizing and constrained welfare-maximizing investment 
decisions in the presence of aggregate demand spillovers. A firm’s 
profits (losses) have a beneficial (adverse) impact on profits of other 
firms, and the firm ignores this impact in making investment deci¬ 
sions. Interestingly, this externality has no adverse welfare conse¬ 
quences in the certainty model of Section II because there a firm has a 
positive spillover effect on other firms if and only if it makes a positive 
profit by investing. In the uncertainty case, in contrast, when a firm’s
profit averages to zero across states, its spillover effect does not aver¬ 
age to zero. 

To see this, consider a marginal firm that expects to break even on 
average if it invests. When the state of the world turns out to be good, 
many other firms are investing in cost reduction and the marginal 
firm’s positive profit raises profits in all these sectors, giving its profits 
a high multiplier in the generation of aggregate income. When the 
state of the world turns out to be bad, only a few firms are investing in 
cost reduction, and the loss by the marginal firm spills over onto the 
profits of only a few firms, making the multiplier on that loss small. 
Overall, even though the marginal firm expects on average to break 
even, the impact of its decision to invest on expected aggregate income
is strictly positive. In this way, uncertainty about the productive
potential of the economy in the presence of aggregate demand spill¬ 
overs gives rise to systematic underinvestment. 

Since our model is highly stylized, Section III contains a discussion 
of the generality of the underinvestment result. We also illustrate how 
the idea of variable aggregate income multipliers can lead to similar 
results in a dynamic context even without uncertainty about produc¬ 
tive opportunities. 


II. The Full-Information Economy

The benchmark economy described in this section sets the stage for 
the subsequent analysis. It shares with the models to follow the as¬ 
sumptions about preferences, technology, and markets but uses a 
particularly simple information structure. 

Consider a one-period economy with a representative consumer, 
who has Cobb-Douglas preferences defined over a unit interval of 
goods. All goods have the same expenditure shares. Thus when his 
income is y, the consumer can be thought of as spending y on every 
commodity. He is endowed with L units of labor, which he supplies 
inelastically, and he owns all the profits of this economy. When his 
wage is taken as numeraire, his budget constraint is given by 

$$y = \Pi + L, \tag{1}$$

where $\Pi$ is aggregate profits.

Each good is produced in its own sector, and each sector consists of 
two types of firms. First, each sector has a competitive fringe of firms 
that convert one unit of labor input into one unit of output with a 
constant-returns-to-scale technology. In addition, each sector has a 
unique firm that has access to a cost reduction technology. This firm is
alone in having access to that technology in its sector and hence will be
referred to as a monopolist (even though, as we specify below, the
firm does not always operate). Cost reduction requires the input of F
units of labor (required outlay), where F is drawn from the economywide
distribution H(F), and allows each unit of labor to produce
$\alpha > 1$ units of output. In this section, it is publicly known that H(F) is
the realized distribution of required outlays across sectors. Much of
this paper examines the consequences of uncertainty about the realized
distribution H.

The monopolist in each sector decides whether to become a low- 
cost firm or to abstain from production altogether. He reduces his 
costs (“invests”) only if he can earn a profit. The price he charges if he 
produces equals unity since he loses all his sales to the fringe if he 
charges more, and he would not want to charge less when facing a 
unit elastic demand curve. When income is y, the profit of a monopolist
who spends F to reduce costs is

$$\pi = \frac{\alpha - 1}{\alpha}\, y - F \equiv ay - F. \tag{2}$$

The monopolist invests as long as $y \ge F/a$. It is obvious from this that,
in equilibrium, under the assumption that all firms expect the same
aggregate income, if a firm with required outlay F invests, then all
firms with required outlays less than F also invest. We assume that
$aL - F_{\min} > 0$, where $F_{\min}$ is the lower end of the support of H; that
is, it always pays the best cost reducer to invest.

A perfect-foresight equilibrium in this economy is given by the 
marginal firm with required outlay F* and income y(F*) such that (a) 
income y(F*) obtains when all firms with required outlays no greater 
than F* invest, and (b) the marginal firm breaks even, that is,

ay(F*) - F* = 0. (3) 

When all firms with required outlays no greater than F* invest, then
aggregate profits are given by

$$\Pi = \int_{F_{\min}}^{F^*} [ay(F^*) - F]\,dH(F) = ay(F^*)H(F^*) - \int_{F_{\min}}^{F^*} F\,dH(F). \tag{4}$$

Combining (4) and (1), we obtain the expression for income:

$$y(F^*) = \frac{L - \int_{F_{\min}}^{F^*} F\,dH(F)}{1 - aH(F^*)}. \tag{5}$$

Equilibrium obtains at F* if (3) holds for income given by (5).
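
To make the equilibrium condition concrete, the following sketch solves (3) with income from (5) by bisection. It assumes, purely for illustration, that H is uniform on [F_MIN, F_MAX] and picks arbitrary values for the labor endowment and the productivity parameter; none of these numbers come from the paper.

```python
def make_uniform(f_min, f_max):
    """Return H(F) and the partial outlay integral for a uniform H (assumed)."""
    width = f_max - f_min

    def cdf(f):
        return min(max((f - f_min) / width, 0.0), 1.0)

    def outlay_integral(f):
        # Integral of F dH(F) from f_min up to f.
        f = min(max(f, f_min), f_max)
        return (f * f - f_min * f_min) / (2.0 * width)

    return cdf, outlay_integral


def income(f_star, labor, a, cdf, outlay_integral):
    """Income from (5) when all firms with outlay <= f_star invest."""
    return (labor - outlay_integral(f_star)) / (1.0 - a * cdf(f_star))


def solve_cutoff(labor, a, f_min, f_max, tol=1e-12):
    """Find F* with a * y(F*) - F* = 0, the break-even condition (3)."""
    cdf, outlay_integral = make_uniform(f_min, f_max)

    def marginal_profit(f):
        return a * income(f, labor, a, cdf, outlay_integral) - f

    if marginal_profit(f_max) >= 0.0:
        return f_max  # every firm invests
    lo, hi = f_min, f_max  # profit is positive at lo, negative at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal_profit(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters: each unit of labor yields ALPHA units after investing.
ALPHA = 2.0
A = (ALPHA - 1.0) / ALPHA
LABOR = 1.0
F_MIN, F_MAX = 0.1, 1.5

f_star = solve_cutoff(LABOR, A, F_MIN, F_MAX)
H, outlays = make_uniform(F_MIN, F_MAX)
y_star = income(f_star, LABOR, A, H, outlays)
aggregate_profit = A * y_star * H(f_star) - outlays(f_star)  # equation (4)

print("cutoff F*:", round(f_star, 4))
print("income y(F*):", round(y_star, 4))
```

At the computed cutoff the marginal firm earns zero, and income exceeds the labor endowment by exactly the aggregate profit, as the budget constraint (1) requires.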

The numerator of expression (5) is the amount of labor used in the 
economy for actual production of output, after investment outlays.
One over the denominator is the multiplier that recognizes that an
increase in effective labor raises income by more than one for one 
since expansion of low-cost sectors also raises profits. To see this more 
explicitly, one can calculate that 

$$\frac{dy(F^*)}{dF^*} = \frac{\pi(F^*)\,dH(F^*)}{1 - aH(F^*)}, \tag{6}$$

where $\pi(F^*)$ is the profit of the marginal firm. When the marginal
firm earns this profit, it distributes it to shareholders, who in turn 
spend it on all goods and thus raise profits of all cost-reducing firms in 
the economy. The effect of the marginal firm’s profit is therefore 
enhanced by the increases in profits of all cost-reducing firms result¬ 
ing from increased spending. Since there are H(F*) of such firms, the 
multiplier is increasing in the number of firms that benefit from the 
spillover of the marginal firm. The more firms invest, the greater is 
the cumulative increase in profits and therefore income resulting 
from a positive net present value investment by a marginal firm. 

For an alternative interpretation of (6), notice that since the price of 
labor is unity, the profit of the marginal firm, $\pi(F^*)$, is exactly equal to
the net labor saved from its investment in cost reduction. The 
numerator of (6) is therefore the increase in labor available to the 
economy as a result of the investment by the F* firm in the cost 
reduction technology. In equilibrium, this freed-up labor moves into 
all sectors. However, its marginal product is higher in investing sec¬ 
tors than in noninvesting sectors. The more sectors investing in cost 
reduction—that is, the higher is H(F *)—the greater is the increase in 
total output resulting from the inflow of freed-up labor into these 
sectors. In fact, the denominator of (6) is just the average of marginal 
labor costs across sectors, which is clearly a decreasing function of 
H(F*). This interpretation connects (6) to (5), which explicitly states 
that income is a multiple of productive labor and that the multiplier is 
increasing in H(F*). 
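
For readers who want the intermediate step, the link between (5) and (6) can be written out as follows. The derivation assumes that H is differentiable and writes $h(F^*)$ for its density, a symbol introduced here only for convenience; $\pi(F^*) = ay(F^*) - F^*$ as in (2).

```latex
\begin{aligned}
y(F^*)\,[1 - aH(F^*)] &= L - \int_{F_{\min}}^{F^*} F\,dH(F)
  && \text{(rearranging (5))} \\
\frac{dy(F^*)}{dF^*}\,[1 - aH(F^*)] - a\,y(F^*)\,h(F^*) &= -F^*\,h(F^*)
  && \text{(differentiating in } F^*\text{)} \\
\frac{dy(F^*)}{dF^*} &= \frac{[a\,y(F^*) - F^*]\,h(F^*)}{1 - aH(F^*)}
  = \frac{\pi(F^*)\,h(F^*)}{1 - aH(F^*)} \\
1 - aH(F^*) &= [1 - H(F^*)]\cdot 1 + H(F^*)\cdot\frac{1}{\alpha}
  && \text{(since } a = 1 - 1/\alpha\text{)}
\end{aligned}
```

The last line is the sense in which the denominator is an average of marginal labor costs: noninvesting sectors need one unit of labor per unit of output, and investing sectors need $1/\alpha$.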

Proposition 1. The equilibrium exists and is unique. The number
of firms investing in cost reduction is efficient at the given prices. 

Proof. Denote by $\pi(F \mid F^* = F)$ the profit of the firm with required
outlay F when only the firms with required outlays no greater than F
invest. Call the investing firm with the highest required outlay the
marginal firm. (a) Existence: Note that $\pi(F_{\min} \mid F^* = F_{\min}) = aL - F_{\min} > 0$.
Either $\pi(F_{\max} \mid F^* = F_{\max}) \ge 0$, in which case every firm investing
is an equilibrium, or $\pi(F_{\max} \mid F^* = F_{\max}) < 0$, in which case there exists
an F such that $\pi(F \mid F^* = F) = 0$ by the intermediate value theorem.
(b) Uniqueness: Investment by a firm making a negative profit reduces
aggregate income. Take an equilibrium with marginal firm F*.
Now raise the number of investing firms in order of the magnitude of
their required outlays, starting with those just above F*. Since F*
firms break even at the initial equilibrium, firms with F > F* lose
money. Adding them can only reduce aggregate income, making in¬ 
vestments by each additional firm even more unprofitable. To find 
another equilibrium, however, income must be raised so that a new 
marginal firm with required outlay F** > F* can break even. Since 
adding investing firms with required outlays above F* only reduces 
income, this is impossible, (c) Efficiency: An investing firm adds to 
aggregate income (and therefore, at constant prices, to welfare) if and 
only if the firm’s profits are positive. Consider an investment rule in 
which some (possibly empty) subset of firms with F < F* do not invest 
and some (possibly empty) subset with F > F* do invest. Since all 
those with F < F* are making a positive profit in the F* equilibrium, 
eliminating any of them only decreases income. Now consider adding 
some firms with F > F* in ascending order of their F's. Since income
is no higher after some subset of firms with F < F* is eliminated, the
lowest F > F* firms will make a negative profit from investing. This
further decreases income, making investment by firms with higher F's
even more unprofitable. Q.E.D. 

The efficiency result deserves a comment. According to expression 
(6), a firm’s spillover is positive if and only if its own profits are 
positive. Therefore, even though a firm deciding whether or not to 
reduce its unit cost ignores the spillover, it decides to do so only when 
the social planner would choose likewise. The multiplier changes only 
the magnitude of the effect of a firm’s investment on income, and 
not the sign. Under certainty, both second-best (constrained by mo¬ 
nopoly pricing) welfare maximization and profit maximization dictate 
that an investment be undertaken if and only if it earns a positive 
profit. 

The key assumption on preferences and technologies required for 
this result is that demand be sufficiently inelastic that a cost-reducing 
firm does not want to cut its price more than just below that of the 
competitive fringe. This means that an investing firm makes its profit 
by serving the same customers as the fringe but incurring lower costs. 
Any profits it makes result from using less labor to produce the same 
output, and not from expanding its sales. The situation becomes 
more complex when demand is elastic, and cost-reducing firms cut 
their prices substantially below those charged by the fringe. By cut¬ 
ting prices, a cost-reducing firm raises consumer surplus and so may 
raise welfare even when its investment does not break even. But it also 
steals sales and profits from cost-reducing firms in other sectors in 
order to recoup its fixed cost and so may reduce welfare even when its 
own investment is profitable (Mankiw and Whinston 1986). The inter¬ 
play of these two opposing effects can lead to either too little or too 
much investment by potential cost-reducing firms.

In the remainder of the paper, we show that, under uncertainty,
the profit-maximizing investment decisions of firms are not con¬ 
strained-efficient even when cost-reducing firms do not cut prices. 
Specifically, if a firm’s profit across states averages to zero, its average 
spillover effect on other firms is in general positive. 


III. The Incomplete-Information Model 


Suppose now that there are two states of the world, characterized by 
different distributions of required outlays across sectors. In the good 
state, the distribution is G(F); in the bad state, it is B(F). Assume that 
the densities g(F) and b(F) are strictly positive and continuous on
$[F_{\min}, F_{\max}]$ and that the likelihood ratio b(F)/g(F) is strictly increasing
in F on that interval. That is, the relative likelihood of a higher fixed
cost is higher in the bad state. This implies, in particular, that G(F) >
B(F) for all F in $(F_{\min}, F_{\max})$.

The probability that the state is good is denoted by p; it is a common
prior to all market participants. In this section, each potential monopolist
also observes his own required outlay F but does not know which
state is realized. For this reason, he must form a posterior belief, q(F),
that the state is good:


$$q(F) = \frac{pg(F)}{pg(F) + (1 - p)b(F)}. \tag{7}$$

Because the likelihood ratio b(F)/g(F) is assumed to be increasing in F,
q(F) is decreasing in F for any prior p. The higher is the required 
outlay that a firm draws, the lower is the probability it attaches to the 
outcome of a good state. 
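
A short numerical illustration of the posterior in (7) follows. The two linear densities are hypothetical choices, not taken from the paper; they were picked only because they are strictly positive on the support and have a strictly increasing ratio b(F)/g(F).

```python
F_MIN, F_MAX = 0.1, 1.5
WIDTH = F_MAX - F_MIN


def g(f):
    """Assumed good-state density: linear and decreasing in F."""
    u = (f - F_MIN) / WIDTH
    return (1.5 - u) / WIDTH


def b(f):
    """Assumed bad-state density: linear and increasing in F."""
    u = (f - F_MIN) / WIDTH
    return (0.5 + u) / WIDTH


def posterior_good(f, prior):
    """Posterior probability of the good state given own outlay f, as in (7)."""
    return prior * g(f) / (prior * g(f) + (1.0 - prior) * b(f))


grid = [F_MIN + WIDTH * i / 10 for i in range(11)]
posteriors = [posterior_good(f, prior=0.5) for f in grid]

for f, q in zip(grid, posteriors):
    print(f"F = {f:.2f}  q(F) = {q:.3f}")
```

A firm that draws a low required outlay concludes that the good state is more likely than the prior suggests, and a firm that draws a high outlay concludes the opposite.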

When a firm conjectures that income is $y_g$ in the good state and $y_b$ in
the bad state, it invests provided

$$a\{q(F)y_g + [1 - q(F)]y_b\} - F \ge 0. \tag{8}$$

Because profits in each state are linear in income, all that a firm cares 
about in its investment decision is the average level of income it ex¬ 
pects. 

A rational expectations equilibrium is defined as a cutoff required 
outlay F* of the marginal firm, and incomes in the good and bad 
states $y_g(F^*)$ and $y_b(F^*)$ given by (5) using G(F) and B(F), respectively,
such that the marginal firm expects to break even. To a firm with
required outlay F, expected income is

$$y^e(F) = q(F)y_g(F^*) + [1 - q(F)]y_b(F^*). \tag{9}$$

Since q(F) is decreasing in F, $y^e(F)$ is decreasing in F, and therefore
firms with required outlays below F* always prefer to invest whenever 
the marginal firm expects to break even. In equilibrium, all agents 
agree on F*, and hence on incomes in the two states, but disagree on
their relative likelihoods. The marginal firm must expect to break 
even using its own assessment of the probability that the state is good. 

Proposition 2. Under incomplete information, there always exists 
at least one equilibrium. As long as not all firms invest in equilibrium, 
investment by some group of firms with required outlays above F* 
raises expected income. If there are multiple equilibria, the equilib¬ 
rium with the highest F* is Pareto preferred to the others. 

Proof. (a) Existence: Consider the function $E[\pi(F \mid F^* = F)]$ under
the assumption that $E[\pi(F_{\min} \mid F^* = F_{\min})] > 0$ and apply the intermediate
value theorem. (b) Underinvestment: We show that, for any
equilibrium cutoff level F*, there exists an $\varepsilon > 0$ such that investment
by the firms in the interval $(F^*, F^* + \varepsilon)$ raises expected income. We
have

$$\frac{dE(y)}{d\varepsilon} = \frac{p \cdot g(F^*) \cdot \pi_g(F^*)}{1 - aG(F^*)} + \frac{(1 - p) \cdot b(F^*) \cdot \pi_b(F^*)}{1 - aB(F^*)}.$$

Note that $pg(F^*)\pi_g(F^*) + (1 - p)b(F^*)\pi_b(F^*) = 0$. However, G(F*) >
B(F*), and since $\pi_g(F^*) > 0$ and $\pi_b(F^*) < 0$, we conclude that $dE(y)/d\varepsilon$
(at $\varepsilon = 0$) > 0. (c) Pareto ranking of equilibria: Let $F_1^*$ and $F_2^*$ be the
cutoff levels for two different equilibria with $F_1^* < F_2^*$. Let $\Delta E(y)$ be
the difference of expected incomes in the $F_2^*$ and $F_1^*$ equilibria; we
show that $\Delta E(y)$ is positive. We have

$$\Delta E(y) = \frac{p \int_{F_1^*}^{F_2^*} \pi_{\text{good}}(F)\,dG(F)}{1 - aG(F_1^*)} + \frac{(1 - p) \int_{F_1^*}^{F_2^*} \pi_{\text{bad}}(F)\,dB(F)}{1 - aB(F_1^*)},$$

where $\pi_{\text{good}}(F)$ and $\pi_{\text{bad}}(F)$ are based on investment by all firms with
required outlays less than $F_2^*$. But we must have that

$$p \int_{F_1^*}^{F_2^*} \pi_{\text{good}}(F)\,dG(F) + (1 - p) \int_{F_1^*}^{F_2^*} \pi_{\text{bad}}(F)\,dB(F) > 0,$$

or else the firms between $F_1^*$ and $F_2^*$ would not be investing in the $F_2^*$
equilibrium. Since profits are positive in the good state and $G(F_1^*) >
B(F_1^*)$, it follows that $\Delta E(y)$ is positive. Q.E.D.

The logic of the underinvestment result warrants some elaboration. 
Since more firms invest in the good state, the positive profit that the 
marginal firm earns in that state spills over onto more investing firms 
than the negative profit does in the bad state. Put differently, the 
multiplier on the marginal firm’s profit is higher in the good state. As 
a result, even when the marginal firms expect to earn zero, the ex¬ 
pected change in income from investing is positive. 
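
The underinvestment logic can be traced through a numerical example. The sketch below is built on assumptions that are not in the paper: linear densities for the two states, a prior of one half, and arbitrary values for labor and productivity. It finds one equilibrium cutoff by bisection and then evaluates the derivative of expected income from part (b) of the proof.

```python
ALPHA = 2.0                      # output per unit of labor after investing
A = (ALPHA - 1.0) / ALPHA        # profit margin a in (2)
LABOR = 1.0
PRIOR = 0.5                      # prior probability of the good state
F_MIN, F_MAX = 0.1, 1.5
WIDTH = F_MAX - F_MIN


def unit(f):
    """Rescale an outlay to the unit interval."""
    return (f - F_MIN) / WIDTH


# Assumed densities, distribution functions, and partial outlay integrals.
def g(f): return (1.5 - unit(f)) / WIDTH
def b(f): return (0.5 + unit(f)) / WIDTH
def G(f): return 1.5 * unit(f) - 0.5 * unit(f) ** 2
def B(f): return 0.5 * unit(f) + 0.5 * unit(f) ** 2


def outlays_good(f):
    u = unit(f)
    return F_MIN * (1.5 * u - 0.5 * u * u) + WIDTH * (0.75 * u * u - u ** 3 / 3.0)


def outlays_bad(f):
    u = unit(f)
    return F_MIN * (0.5 * u + 0.5 * u * u) + WIDTH * (0.25 * u * u + u ** 3 / 3.0)


def income_good(f): return (LABOR - outlays_good(f)) / (1.0 - A * G(f))
def income_bad(f): return (LABOR - outlays_bad(f)) / (1.0 - A * B(f))


def posterior_good(f):
    return PRIOR * g(f) / (PRIOR * g(f) + (1.0 - PRIOR) * b(f))


def expected_marginal_profit(f):
    """Expected profit of the cutoff firm when firms up to f invest, as in (8)."""
    q = posterior_good(f)
    return A * (q * income_good(f) + (1.0 - q) * income_bad(f)) - f


def find_cutoff(tol=1e-12):
    lo, hi = F_MIN, F_MAX        # expected profit is positive at lo, negative at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_marginal_profit(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


f_star = find_cutoff()
pi_g = A * income_good(f_star) - f_star   # marginal firm's profit, good state
pi_b = A * income_bad(f_star) - f_star    # marginal firm's profit, bad state
break_even = PRIOR * g(f_star) * pi_g + (1.0 - PRIOR) * b(f_star) * pi_b
d_income = (PRIOR * g(f_star) * pi_g / (1.0 - A * G(f_star))
            + (1.0 - PRIOR) * b(f_star) * pi_b / (1.0 - A * B(f_star)))

print("cutoff:", round(f_star, 4))
print("profit in good / bad state:", round(pi_g, 4), round(pi_b, 4))
print("effect of extra investment on expected income:", d_income)
```

The marginal firm breaks even in expectation, yet the derivative is positive because its gain in the good state is scaled by the larger multiplier 1/(1 - aG(F*)).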

Alternatively, consider the interpretation with change in productive
labor. In the good state, the labor that is freed up and spread
around as a result of the investment by the marginal firm goes to a
large extent into the already investing sectors, where the marginal 
product of that labor is high. In this state, a fraction G(F*) of that 
labor has high productivity. In the bad state, when productive labor is 
withdrawn from the economy as a result of investment by the mar¬ 
ginal firm, only B(F*) of the sectors have invested to get high produc¬ 
tivity. In other words, the labor released by the marginal firm in the 
good state is more productive than the labor absorbed by it in the bad 
state. 

The difference between productivity of labor across states is not 
internalized by investing firms, however. Recall that an investing 
firm's profit is equal to the expected amount saved on its sector’s wage 
bill from switching to the low-marginal-cost technology (at an initial 
cost of F) instead of leaving production to the fringe. But the value to 
the economy of the labor saved is equal to the wage payments to that 
labor plus the profit that labor produces elsewhere. In our model, the 
wage is constant, and only the profit component of the value of labor 
saved varies across states. Firms ignore variation in this profit compo¬ 
nent when making their investment decisions. Since there are more 
sectors using labor to produce profits in good times, the profit of the 
marginal firm understates the true value of labor saved in good times 
more than it understates the value of extra labor used in bad times. 
On average, investment by a firm with zero expected profit raises the 
productive labor available to the economy and is therefore preferred 
by the planner. 2 

This underinvestment property of the model relies crucially on two 
assumptions. First, it depends on the marginal cost schedule not being 
too steeply upward sloping. With U-shaped average cost curves, a 
given cost-reducing firm could have higher total profits but a lower
marginal payoff from an additional unit of demand in good times 
than in bad times since in good times the marginal cost of producing 
the last unit of output might be much higher. This effect could out¬ 
weigh the effect on the multiplier of a greater number of firms having 
had lowered costs in good times. The multiplier would then be lower 
in good times, and the level of investment would be excessive from 
the planner's point of view. 

Second, we have assumed away the problem of uncaptured surplus
when cost-reducing firms cut their prices significantly below those of
the competitive fringe. If an investing firm cuts prices, it creates
consumer surplus but may also reduce producer surplus of firms in other
sectors by stealing their sales. As we mentioned above, the net result is
that investment can be either too low or too high from the social point
of view. We treat the special case of inelastic demand in order to
concentrate on the issue of aggregate demand externalities.

2 The underinvestment result is generated by a combination of imperfect
competition and aggregate demand spillovers in this economy since we have
made sure that the beliefs of the planner and of the marginal firm are the
same. Ignorance about the state of nature is not, therefore, the only
source of investment inefficiency in the model since the equally
well-informed planner would have more firms investing.

The underinvestment property of the model naturally leads to a 
multiplicity of equilibria in many cases. Multiplicity arises in the 
model when there is a group of marginal firms whose members make
positive profits from investing if and only if (at least some) other 
members of the group invest. This situation occurs for a wide set of 
parameters, primarily because of the underinvestment property of 
the model. 

Suppose that we are at an equilibrium in which all firms having 
required outlays below F* invest. A firm with required outlay F* + ε
will make a small negative profit if it invests by itself. On the other 
hand, if an entire interval of firms with required outlays slightly above 
F* invest, they will have a potentially large positive effect on average 
income, possibly making the decision to invest profitable for all firms 
in that interval. This would mean that there must be another equilib¬ 
rium in which these firms invest. Hence, the property of the model 
that investment by a group of marginal firms raises income can be 
seen to lead to the existence of multiple equilibria. 

Because this bootstrapping property relies on there being different 
multipliers across states (and higher multipliers in good states), it is 
easy to see why we cannot get multiple equilibria in the one-period 
certainty model. But one could get the underinvestment property and 
the existence of multiple equilibria even in a world of certainty if 
investments generated more than one period of cash flows. In decid¬ 
ing whether or not to invest, firms would look at a discounted sum of 
cash flows, while the social planner would look at the same sum except 
with each period’s cash flow weighted by the aggregate income multi¬ 
plier for that period. Profit-maximizing firms would ignore variation 
in these multipliers across periods and might underinvest (overinvest) 
if their highest profits occurred in the periods in which their spillover 
effects were largest (smallest). 
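
The contrast between the firm's calculation and the planner's can be sketched numerically. The cash flows, discount rate, and per-period multipliers below are purely hypothetical values chosen for illustration, and the helper name `discounted_sum` is an assumption of this sketch rather than part of the model.

```python
def discounted_sum(cash_flows, rate, weights=None):
    """Discounted sum of cash flows, optionally weighting each period.

    With weights=None this is the firm's private calculation; passing
    per-period aggregate income multipliers gives a planner-style sum.
    """
    if weights is None:
        weights = [1.0] * len(cash_flows)
    return sum(
        w * cf / (1.0 + rate) ** t
        for t, (cf, w) in enumerate(zip(cash_flows, weights))
    )


# Hypothetical project: outlay today, profits in the next two periods.
cash_flows = [-100.0, 40.0, 70.0]
# Hypothetical multipliers that rise over time (largest when profits peak).
multipliers = [1.0, 1.2, 1.5]
rate = 0.10

private_value = discounted_sum(cash_flows, rate)
social_value = discounted_sum(cash_flows, rate, multipliers)

# The firm declines the project although the weighted sum is positive.
print(private_value < 0 < social_value)
```

With these illustrative numbers the highest profits fall in the periods with the largest multipliers, which is the underinvestment case described in the text.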

The idea of inefficient investment due to variation of aggregate 
income multipliers over time can be applied to rapidly developing 
economies. Suppose that firms must incur the cost of a modern plant 
today but reap the profits only in future periods. Their investment 
therefore absorbs current labor and releases future labor. If the econ¬ 
omy is progressing, then it is probable that today’s labor has less 
productive alternative uses than future labor. But if productivity 
gains are mostly confined to a subset of imperfectly competitive
industries, those gains may not be reflected in either lower prices or
higher wages. This means that a larger portion of the value of future
labor saved from investing in a modern plant than of current labor 
used to build it is accounted for by the profits of other sectors that the 
investing firm does not internalize. As a result, the firm’s profitability 
calculation would place a lower relative value on future labor than a 
social planner’s would. It might therefore choose not to invest even 
though it is socially optimal for it to do so. In other words, the inter¬ 
nally generated level of investment in a rapidly developing economy 
characterized by rising aggregate income multipliers (profit spill¬ 
overs) will be too low relative to the second-best optimum constrained 
by monopoly pricing in cost-reducing sectors (Murphy, Shleifer, and 
Vishny 1987).

Micro Data

One of the oldest questions in macroeconomics concerns the correlation
between the business cycle and the real wage. We provide new
evidence on this question by examining the possible bias that arises 
when (1) workers have unobserved characteristics that affect their 
wages and (2) those workers who move in and out of the work force 
over the cycle have unobserved characteristics systematically differ¬ 
ent from those who stay in. We distinguish as well between the bias 
that arises from those unobserved characteristics that are permanent 
components of wages and those that are transitory. We utilize micro, 
panel data, and maximum likelihood selectivity bias techniques to 
estimate both the extent of this selectivity-cum-aggregation bias and 
the true effect of the cycle on real wages. We find that selectivity bias
is present: workers are more likely to lose employment during a 
recession if they have high wages, especially if they have a high 
transitory wage component. Overall, the effect of selectivity is to bias 
ordinary least squares estimates based only on workers in a procy¬ 
clical direction. Our results show that the true effect of the cycle on 
wages is still procyclical but much smaller in magnitude than previ¬ 
ous estimates using micro data have suggested. 


We would like to thank Steven Allen, Joseph Altonji, Mark Bils, Herschel Grossman,
Alan Harrison, Thomas Kniesner, Tony Lancaster, Marilyn Manser, Doug Young, and
the participants of seminars at Brown University, Columbia University, the
Massachusetts Institute of Technology, the University of Chicago, the University
of Iowa, and the University of Wisconsin for comments.

[Journal of Political Economy, 1988, vol. 96, no. 6]

© 1988 by The University of Chicago. All rights reserved. 0022-3808/88/9606-0007$01.50



I. Introduction 

Most theories in economics, particularly in macroeconomics, are de¬ 
veloped to explain some received “stylized fact” or set of such “facts.” 
In the standard method of social science research, different theories 
are developed to explain these facts—theories that, preferably, can be 
distinguished empirically because they have different additional im¬ 
plications that can be tested. Thus entire bodies of literature, both 
theoretical and empirical, often develop to explain an original set of 
facts. Unfortunately, the received stylized facts are often in serious 
dispute themselves, throwing into question the value of those bodies 
of literature. 

Perhaps nowhere in economics is this problem better illustrated 
than in the study of the behavior of real wage rates over the business
cycle. Keynes (1936) believed that the patterns of real wages and 
employment over the cycle represent movements along a fixed short- 
run labor demand schedule. In this he agreed with the classical writ¬ 
ers. The implied prediction of countercyclical wage movements set 
off a cycle of empirical testing and new theorizing that has persisted 
to this day. While Dunlop (1938) and Tarshis (1939) are generally 
interpreted as having found evidence supporting non-Keynesian, 
procyclical real wage behavior, Bodkin (1969) found that an acyclic 
real wage could not be rejected by the data. 1 The well-known disequi¬ 
librium macro model of Barro and Grossman (1971) was designed, in 
part, to explain why the real wage could be either countercyclical, 
procyclical, or acyclic; in short, the real wage in the model bears no 
definite relationship to employment at all. Contract theory, which in 
its simplest forms predicts fixed real wages, may also be viewed in part 
as a response to the empirical evidence (see Rosen [1985] for a recent 
survey). Subsequently, the picture has become more muddied. Neftci 
(1978) found new evidence supporting countercyclical real wage be¬ 
havior, thus reinvigorating the fixed labor demand curve interpreta¬ 
tion of the business cycle, and Sargent (1978) argued that the evi¬ 
dence is consistent with an economy in which there is movement 
along a labor demand curve subject to dynamic costs of adjustment. 
But Geary and Kennan (1982) found that the evidence of Sargent and 
Neftci essentially disappears when a different price index is employed 
and when a longer time period is considered. A number of recent
studies using micro, panel data have also been conducted and have all
found procyclical wage behavior (Raisian 1983; Coleman 1984; Bils
1985; Hashimoto and Raisian 1985).

1 However, Coleman (1984) has pointed out that Dunlop and Tarshis examined only
the correlation between real wages and money wages, assuming the latter to be
procyclical. A direct examination of the Dunlop-Tarshis data actually shows some
evidence of countercyclical real wage movement, which was even noted by Tarshis.
See Coleman (P-Wff.) for a discussion.

In this paper we continue the search for an empirical resolution of
this important issue. Our interest is in determining whether any
biases can be detected in prior studies that might help explain their
differences. Our major focus is on the aggregation bias that may arise
when wages of only working individuals are followed over the cycle.
The potential for bias arises if the population is heterogeneous in
some attribute affecting wages, an attribute that may be intrinsically
unobservable. One such source of heterogeneity is skill level, arising
either because the economy is composed of a set of sectors within
which labor is homogeneous but across which labor is heterogeneous,
or because every individual has a different marginal product (i.e.,
each individual is his own “sector”). Heterogeneity may also arise
from compensating wage differentials for risk of layoff that vary
across workers. In any case, from whatever source heterogeneity
arises, aggregation bias will result if the types of individuals who move
into and out of the work force over the cycle are not randomly chosen
from the population with respect to such heterogeneous attributes,
for this will cause the economywide mean wage to move simply
because the composition of the work force changes. Indeed, the mean
wage will move even if the wage within each homogeneous sector or
for each continuously employed worker is constant over the cycle.
The wage will be biased countercyclically (procyclically) if relatively
low-wage (high-wage) individuals or sectors are more cyclically
sensitive. Clearly, what is needed is some type of fixed-weight wage index
that is not affected by such compositional changes.

The aggregation bias in the economywide mean wage has been
noted previously by Stockman (1983), Coleman (1984), Bils (1985),
and Hashimoto and Raisian (1985). All these studies employ micro
panel data and hence control for the extra observed characteristics
measurable in such data. However, none of them corrects for the
aggregation bias that results from missing wages for nonworkers, nor
does any study establish the direction of bias from this source.
Instead, the studies employ various random effects and fixed effects
estimators on workers alone without correcting for selection bias.
Their findings of procyclical wage behavior may thus be in part a
result of such bias. Indeed, our estimates show that the bias is in a
procyclical direction. 2 In addition, our results indicate that controlling
for the extra observed characteristics in micro data without controlling
for selection bias actually worsens this procyclical bias.

2 Bils (1985) does test a selection bias correction, but his results do not allow him to
reject the selection bias hypothesis (see below). The aggregation bias has also been
noted by Heckman and Sedlacek (1985), although in the context of an equilibrium
model and not from an examination of the unemployment rate. In addition to provid-

We correct for aggregation bias by using the self-selection correc¬ 
tion technique of Heckman (1974). The technique inherently re¬ 
quires micro (i.e., individual) data because it explicitly estimates the 
correlation between the wage rate of an individual and the probability 
that he is employed. This estimate is the indicator of the presence or 
absence of self-selection and aggregation bias. Implicitly, this proce¬ 
dure imputes a wage to nonworkers and hence allows us to include 
them in a fixed-weight, or composition-constant, wage index whose 
movement over the cycle will reflect “true” wage movements. Al¬ 
though a single cross section of micro data would be sufficient to 
estimate the correlation between wage rates and employment proba¬ 
bilities, we employ panel data instead in order to improve the 
efficiency of the estimates (as well as for other reasons given in the 
paper). We estimate both random effects and fixed effects models 
and conduct specification tests to determine their robustness and to 
choose between them. Our results show that the aggregation bias is in 
a procyclical direction but that a correct real wage index nevertheless 
continues to follow a mildly procyclical pattern after correction. 

The paper is organized as follows. In Section II we discuss a statisti¬ 
cal model in which heterogeneity and self-selection are represented 
and we present our estimation methods. In Section III we present our 
data and central results. Section IV contains further findings on the 
relation of our results to those of aggregate studies. A conclusion is 
given in Section V. 

II. A Statistical Model of Heterogeneity and 
Self-Selection over the Business Cycle 

A. General Issues 

Virtually all macro theories (Keynesian, contract, etc.) assume for 
simplicity that the labor market is homogeneous. Consequently, those 
laid off and rehired over the business cycle are randomly chosen and, 
moreover, the wage rates of all those who continue to work are identi¬ 
cal. Therefore, empirically, the fact that no wage rates can be ob¬ 
served for nonworkers creates no bias in the estimates of the move¬ 
ment of the mean wage computed over the sample of workers only. 

ing a direct examination of business cycle effects, our study differs from that of
Heckman and Sedlacek by our use of panel data, which can improve efficiency and
possibly reduce asymptotic bias. Chirinko (1980) also examined the compositional
shift issue by constructing an aggregate wage series with fixed industry employment
weights. He found that ignoring industrial composition shifts over the cycle
produces a procyclical bias, as we find with individual-level data.




Problems arise if the labor market is heterogeneous. For example, 
if there are different sectors of the economy with different labor 
demand elasticities, a general Keynesian deflation and consequent 
real wage increase will induce greater layoffs in the high-elasticity 
sectors. If mean wages also differ across sectors, the change in the 
relative proportion of workers in each sector will generate a move¬ 
ment in the economywide mean wage taken over those still working. 
This movement is independent of the initial deflation-induced wage 
increase. Thus if high-wage manufacturing were more cyclically sen¬ 
sitive than other sectors of the economy, the compositional effect 
would push the mean economywide wage downward as unemploy¬ 
ment increases. We regard such composition-induced movements in 
the wage as inducing bias in the estimate of the composition-constant 
wage with which most macro theories are concerned. 

Put this way, it is clear that the difficulty with aggregate wages arises 
from a standard index number problem, for a test of any theory of 
the cycle requires that we construct a wage index that is a weighted 
average of the wages in different sectors but that holds those weights 
fixed over the cycle. This notion has been long recognized for shifts in 
industrial composition over the cycle, but we extend it by counting the 
nonmarket sector (i.e., the state of unemployment) as a sector as well. 
By constructing a wage index that includes those who leave employ¬ 
ment during downturns and who reenter during upturns and that 
properly imputes a wage to such individuals, we can obtain a fixed- 
weight wage index whose movements over the cycle are not biased by 
changes in the composition of the work force. 
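
The index number problem described above can be illustrated with a minimal sketch. The two sectors, their wages, and the employment counts are hypothetical numbers chosen only to show the mechanism; the function names are likewise assumptions of this illustration.

```python
def mean_wage(wages, employment):
    """Mean wage over those currently working (weights move with the cycle)."""
    total = sum(employment.values())
    return sum(wages[s] * employment[s] for s in wages) / total


def fixed_weight_index(wages, base_weights):
    """Wage index that holds sector weights fixed at their base-period values."""
    total = sum(base_weights.values())
    return sum(wages[s] * base_weights[s] for s in wages) / total


# Hypothetical sector wages, identical in boom and recession (acyclic wages).
wages = {"manufacturing": 12.0, "services": 8.0}

# Hypothetical employment: high-wage manufacturing is more cyclically sensitive.
boom = {"manufacturing": 50, "services": 50}
recession = {"manufacturing": 30, "services": 45}

mean_boom = mean_wage(wages, boom)
mean_recession = mean_wage(wages, recession)
index_boom = fixed_weight_index(wages, boom)
index_recession = fixed_weight_index(wages, boom)  # same weights, recession wages

print(mean_boom, mean_recession)    # mean over workers falls in the recession
print(index_boom, index_recession)  # fixed-weight index does not move
```

In this example the mean wage of those still working falls even though no sector's wage has changed, which is a purely compositional (here procyclical) movement; the fixed-weight index is unaffected.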

Because most macro theories assume homogeneous workers, there 
have been few formal models worked out that would allow us to 
generate priors on whether high-skill or low-skill workers are more 
likely to leave the work force during downturns (see Grossman [1978] 
for an exception involving seniority and risk shifting). The most com¬ 
mon presumption is that low-skill workers are the first to be laid off, 
though there are few theories that explain this other than those in¬ 
volving union seniority rules and those with specific human capital 
combined with nominal wage rigidities. It is possible that in matching 
models such as that of Jovanovic (1979), which incorporate matching 
heterogeneity as well as worker heterogeneity explicitly, lower-rent 
matches may break up before higher-rent matches. Alternatively, in 
the standard job search model those with higher offered mean wages 
spend less time in search, leading to higher employment rates for
high-wage workers (Burdett and Ondrich 1985; Mortensen 1986). 3
However, those with higher wages may also have higher asset levels
and hence higher reservation wages, leading to lower employment
rates. Put more generally, those with higher wages may also have
higher values of nonmarket time. 4

3 This result requires log concavity of the wage offer density.

Besides these theories of heterogeneity in skills or in other worker 
characteristics, there are also hedonic theories relating wage levels to 
the risk of layoff. The general hedonic model of Rosen (1974) makes 
this type of heterogeneity explicit, and there is some evidence that 
compensating wage differentials for differential risk of layoff are 
present in the labor market (Abowd and Ashenfelter 1981). The 
presumption here would be that high-wage workers are more likely to 
be laid off, other things equal. 

In any case, in our work we shall estimate a reduced-form model in 
which heterogeneity is allowed but the sign of the correlation between 
wage levels and employment probabilities is left free to be determined 
empirically. We shall also adopt a general framework in which every 
member of the population has a different wage—every individual is 
his own “sector.” This assumption will generate a wage distribution 
that can be treated, for convenience, as continuous. In this general 
framework, bias arises if the probability of employment differs across 
the wage distribution. If it does, the “composition” of the work 
force—that is, the relative proportions of workers of different 
types—changes over the cycle. 

Heterogeneity of any type can be observed or unobserved. In the 
former case it is by definition measurable by variables in the data, 
such as personal characteristics (e.g., education or age) or market 
characteristics (industry or occupation). Unobserved heterogeneity is 
the residual and, by definition, ends up in the error term of the 
model. The division of heterogeneity into observed and unobserved 
components depends on the data set utilized, and one advantage of 
individual data over aggregate data is that more observables are gen¬ 
erally available and hence can be controlled for. We shall provide 
evidence below on the degree to which the availability of more observ¬ 
ables reduces aggregation and self-selection bias in the estimated ef¬ 
fect of the cycle on real wages. However, we shall assume that there 
may be omitted variables even in micro data that end up in the error 
term, generating unobserved heterogeneity as well. It is only the pres¬ 
ence of unobserved heterogeneity that generates the need for the 
selection bias techniques that we employ. 

4 The only fully worked out equilibrium model of heterogeneous skills in the labor
market is that of Heckman and Sedlacek (1985).

Our statistical model is a simple reduced-form representation of
that of Heckman (1974), which is by now well known in the selectivity
bias literature. Assuming that we have a panel data set on $N$ individuals
($i = 1, \ldots, N$), each observed for $T_i$ time periods ($t = 1, \ldots, T_i$),
we write the model as

\[ \ln W_{it} = X_{it}\beta + \varepsilon_{it}, \quad \text{observed if } E_{it} = 1, \tag{1} \]

\[ E^*_{it} = Z_{it}\gamma + \nu_{it}, \tag{2} \]

\[ E_{it} = \begin{cases} 1 & \text{if } E^*_{it} \ge 0 \\ 0 & \text{if } E^*_{it} < 0, \end{cases} \tag{3} \]


where $W_{it}$ is the hourly wage rate of individual $i$ at time $t$, $X_{it}$ is a row
vector of regressors, $\beta$ is its associated coefficient vector, $E_{it}$ is a
dummy variable for whether individual $i$ is employed at time $t$, and $Z_{it}$
and $\gamma$ are a regressor row vector and coefficient vector, respectively,
affecting the probability of employment. The error terms in the
model are $\varepsilon_{it}$ and $\nu_{it}$. The employment equation is set up in standard
fashion as a binary choice model in which a latent index, $E^*_{it}$,
determines a dichotomous indicator for employment. This leaves the error
$\nu_{it}$ free to vary from plus to minus infinity. The aggregate
unemployment rate is included in the $X_{it}$ and $Z_{it}$ vectors.

The source of potential bias in estimating equation (1) with ordinary
least squares (OLS) on workers alone can be easily seen, for the
expected value of the log wage in the employed subsample is

\[ \begin{aligned} E(\ln W_{it} \mid E_{it} = 1) &= X_{it}\beta + E(\varepsilon_{it} \mid E_{it} = 1) \\ &= X_{it}\beta + E(\varepsilon_{it} \mid \nu_{it} \ge -Z_{it}\gamma). \end{aligned} \tag{4} \]

Thus we see that if $\varepsilon_{it}$ and $\nu_{it}$ are correlated, the mean of the error
term in (1) will be a function of the $Z_{it}$ in (2). Consequently, if the
elements of $X_{it}$ are correlated with those of $Z_{it}$ at the same time that
$\varepsilon_{it}$ and $\nu_{it}$ are correlated, specification bias will affect the OLS
estimates of $\beta$.

If $\varepsilon_{it}$ and $\nu_{it}$ follow a bivariate normal distribution with correlation $\rho$
and with respective variances $\sigma_\varepsilon^2$ and 1, a more explicit expression can
be obtained:

\[ E(\ln W_{it} \mid E_{it} = 1) = X_{it}\beta + \rho\sigma_\varepsilon\lambda_{it}, \tag{5} \]

where

\[ \lambda_{it} = E(\nu_{it} \mid \nu_{it} \ge -Z_{it}\gamma) = \frac{f(Z_{it}\gamma)}{F(Z_{it}\gamma)}, \tag{6} \]

where $f$ and $F$ are the unit normal density and distribution functions,
respectively. The case of $\rho > 0$ is illustrated in figure 1, with $X_{it}$ and $Z_{it}$
held fixed. If all wages were observed, the distribution of $\ln W_{it}$ would
be normal with mean $\mu$. If $\rho > 0$, selecting out nonworkers eliminates
more low-wage than high-wage observations, generating the dotted
frequency distribution with mean $\mu + \rho\sigma_\varepsilon\lambda$.

Fig. 1. Wage density with and without selection, assuming $\rho > 0$ ($X_{it}\beta = \mu$,
$\lambda_{it} = \lambda$). Solid line: without selection; dotted line: with selection.
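
The truncated-mean formula in (5) and (6) can be checked by simulation. The sketch below is only an illustration under assumed parameter values (the values of `mu`, `sigma_e`, `rho`, and `z_gamma` are hypothetical); it draws bivariate normal errors, keeps the wages of those with a nonnegative employment index, and compares their mean with the formula.

```python
import random
from statistics import NormalDist, fmean


def inverse_mills(z):
    """lambda = f(z) / F(z) for the unit normal, as in eq. (6) with z = Z*gamma."""
    n = NormalDist()
    return n.pdf(z) / n.cdf(z)


def simulate_selected_mean(mu, sigma_e, rho, z_gamma, n_draws=200_000, seed=0):
    """Mean log wage among the employed, i.e. those with v >= -Z*gamma."""
    rng = random.Random(seed)
    observed = []
    for _ in range(n_draws):
        v = rng.gauss(0.0, 1.0)                      # employment error, variance 1
        u = rng.gauss(0.0, 1.0)                      # independent noise
        e = sigma_e * (rho * v + (1.0 - rho ** 2) ** 0.5 * u)  # corr(e, v) = rho
        if v >= -z_gamma:                            # wage observed only if employed
            observed.append(mu + e)
    return fmean(observed)


# Hypothetical parameter values, chosen only for illustration.
mu, sigma_e, rho, z_gamma = 2.0, 0.5, 0.4, 0.3

simulated = simulate_selected_mean(mu, sigma_e, rho, z_gamma)
predicted = mu + rho * sigma_e * inverse_mills(z_gamma)

print(round(simulated, 3), round(predicted, 3))
```

With a positive correlation the mean among workers lies above the mean of the untruncated distribution, as in figure 1.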

The aggregation-cum-selection bias with which we are concerned
arises because a change in the unemployment rate will not only shift
the mean of the distribution ($\mu$) but in general will also shift $\lambda$. For
example, suppose that the unemployment rate ($U$) is the $k$th independent
variable in both the wage and employment equations. Then

\[ \frac{\partial E(\ln W_{it} \mid E_{it} = 1)}{\partial U} = \beta_k + \rho\sigma_\varepsilon \frac{\partial \lambda_{it}}{\partial U} = \beta_k - \rho\sigma_\varepsilon\gamma_k m_{it}, \tag{7} \]

where $m_{it}$ is a positive number. 5 Hence the shift in the mean wage of
workers will not equal $\beta_k$ unless $\rho = 0$ or $\gamma_k = 0$. However, we clearly
expect $\gamma_k < 0$. Hence the wage effect will be countercyclically
(procyclically) biased if $\rho$ is positive (negative). For example, if $\rho > 0$, then in
a recession a disproportionate number of low-wage workers would
lose their jobs. This would lead to an upward drift in the mean wage
that could be mistakenly taken as evidence of countercyclical wage
behavior.
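
As a numerical check on (7), the sketch below compares the analytic slope with a finite-difference derivative of the selected mean in (5). It uses the expression $m = \lambda(\lambda + Z\gamma)$ for the positive factor; all parameter values, and the way the remaining regressors are collapsed into the constants `x_rest` and `z_rest`, are assumptions made for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def lam(z):
    """Inverse Mills ratio f(z)/F(z) from eq. (6)."""
    return _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)


def m(z):
    """Positive factor in eq. (7): m = lambda * (lambda + z)."""
    return lam(z) * (lam(z) + z)


def selected_mean(u, beta_k, gamma_k, rho, sigma_e, x_rest=2.0, z_rest=0.5):
    """Eq. (5) with the unemployment rate u entering both indexes linearly."""
    return x_rest + beta_k * u + rho * sigma_e * lam(z_rest + gamma_k * u)


def numeric_slope(u, h=1e-5, **params):
    """Central finite-difference derivative of the selected mean w.r.t. u."""
    return (selected_mean(u + h, **params) - selected_mean(u - h, **params)) / (2 * h)


# Hypothetical values: acyclic true wage (beta_k = 0), gamma_k < 0, rho > 0.
params = dict(beta_k=0.0, gamma_k=-0.05, rho=0.4, sigma_e=0.5)
u = 6.0
z = 0.5 + params["gamma_k"] * u

analytic = params["beta_k"] - params["rho"] * params["sigma_e"] * params["gamma_k"] * m(z)
numeric = numeric_slope(u, **params)

print(analytic, numeric)
```

Here the true wage is acyclic, yet the mean wage of workers rises with unemployment, which is the countercyclical bias associated with a positive correlation.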

The object of our empirical analysis is to obtain consistent estimates
of $\beta$ and, in the process, of the effect of the unemployment rate
($\beta_k$). The function $X_{it}\beta$ is the fixed-weight index we are seeking, and
its movements over the cycle (i.e., changes in $\mu$ in fig. 1) are the object
of interest. Because theories of the cycle aim to explain shifts in the
mean of the untruncated distribution in figure 1 (not the effect, for
example, of merely lopping off some workers from the distribution),
it is cyclical movements in a fixed-weight wage index like $X_{it}\beta$ that
they purport to predict. Hence, the proper testing of these theories
requires the construction of such an index.

5 Taking the partial derivative of (6) with respect to $U$ and recognizing that
$df(x)/dx = -xf(x)$ and $dF(x)/dx = f(x)$, we see that $m_{it} = \lambda_{it}(\lambda_{it} + Z_{it}\gamma)$.
But since $\lambda_{it}$ is the mean of a distribution truncated from below by $-Z_{it}\gamma$ (see eq.
[6]), $\lambda_{it} > -Z_{it}\gamma$ and hence $m_{it} > 0$.

Consistent and efficient estimates of the unknown parameters of
the model ($\beta$, $\gamma$, $\rho$, and $\sigma_\varepsilon$) can be obtained by maximum
likelihood. 6 The critical parameter $\rho$, whose estimate allows us to
disentangle “true” from “spurious” movements in the wage, is estimated
from the cross-equation correlation between wage equation residuals
of workers and their “residuals” in the employment equation. It
should be noted that this model could be estimated just as easily on a
single cross section of micro data, and an estimate of $\rho$ obtained. It is
inherently the cross-sectional dimension of the data rather than the
time-series dimension that allows an estimate of $\rho$ to be obtained and
the aggregation bias problem to be solved. This raises the question of
why panel data are desirable at all, a question to which we now turn.

B. Econometric Models for Panel Data 

The availability of panel data may appear to make these complicated 
estimation procedures completely unnecessary. As originally stated, 
the bias problem arises because the composition of the work force 
changes over the cycle as individuals with differing wage rates drop 
out and reenter. It has therefore been argued (e.g., Bils 1985) that 
panel data can be used simply to track individuals who stay employed 
at multiple points over the business cycle, thus eliminating the compo¬ 
sition problem. 7 

6 An alternative two-stage procedure developed by Heckman (1979) could also be
used. Though the resulting estimates are consistent, they are not efficient; therefore,
we use full-information maximum likelihood.

7 Another possible use of panel data is to determine the sign of the selectivity bias by 
examining the relative level of past wages of those who are laid off during a recession. 
For example, one could estimate an employment status or layoff equation with the 
lagged wage as a regressor. Unfortunately, aside from selectivity bias problems that 
might be created because not all individuals will have a wage in any given past period, 
the coefficient on the lagged wage will not necessarily correctly measure the sign of the 
correlation between the current (latent) wage and the probability of current employ¬ 
ment. In fact, in the simple permanent-transitory setup we present in this section, the 
coefficient on the lagged wage can be shown to take the sign of the correlation between 
the permanent components of the wage and employment equation errors. As our 
empirical results will show, this sign is the opposite of the contemporaneous wage- 
employment error correlation because transitory wage shocks have a strong correlation 
with the current-period employment equation error, which is opposite in sign to that of
the permanent wage equation error component.

Unfortunately, such a procedure will eliminate the bias only under
restrictive conditions and may instead increase the bias. Intuitively, 
the difficulty is that one would prefer to use panel data to track the 
wages of single individuals over time, including their latent, or poten¬ 
tial, wages in nonworking periods. This would obviously eliminate the 
composition problem. However, selecting only those periods during 
which the individual is employed results in a systematic selection of 
wages that have transitory components different from those of wages 
in nonworking periods, as we shall now show. 

To illustrate first how the bias may be eliminated, assume that the
error term in the wage equation is composed of a permanent,
time-invariant, individual-specific term plus a transitory term:
$\varepsilon_{it} = \mu_i + \eta_{it}$. Assume that $\mu_i \sim N(0, \sigma_\mu^2)$, $\eta_{it} \sim N(0, \sigma_\eta^2)$, and the two
components are independent. Now suppose that we examine a group of
individuals working at $t$ and $t + 1$ and that we difference the wage
equation. The expected wage change will be

\[ \begin{aligned} E(\ln W_{i,t+1} - \ln W_{it} \mid E_{i,t+1} = 1, E_{it} = 1) &= (X_{i,t+1} - X_{it})\beta + E(\eta_{i,t+1} - \eta_{it} \mid E_{i,t+1} = 1, E_{it} = 1) \\ &= (X_{i,t+1} - X_{it})\beta + E(\eta_{i,t+1} - \eta_{it} \mid \nu_{i,t+1} \ge -Z_{i,t+1}\gamma,\ \nu_{it} \ge -Z_{it}\gamma). \end{aligned} \tag{8} \]

From (8) we can see that bias is eliminated if the differenced transitory
error ($\eta_{i,t+1} - \eta_{it}$) is independent of and (hence) uncorrelated
with the probability that the individual is working both periods. If $\mu_i$ is
correlated with the latter probability, it causes no difficulty because it
is differenced out. But bias remains if the second term in (8) is nonzero,
that is, if the change in the transitory error is correlated with the
probability of moving into or out of the work force and, hence, with
the probability of staying in the work force both periods.

As an example, suppose that ifc, and v* are positively correlated but 
that the true wage is acyclic (0* = 0). Then an increase in the unem¬ 
ployment rate from one period to a second will have no effect on the 
true mean of the wage distribution. But because those with transitor¬ 
ily low wages in the second period (i.e., low draws of 1 ^./+ 1 ) are more 
likely to drop out of the work force than those with transitorily high 
wages (with the permanent component held constant), a dispropor¬ 
tionate number of those remaining in a continuous-worker sample 
will have wage increases rather than decreases. Thus although the 
transitory wage increases and decreases net out to zero in the total 
population, the increases will predominate in the sample of those still 
working in the second period. Consequently, a countercyclical bias 
will be imparted to the wage-unemployment relationship. Reasoning 
analogously, it can be seen that a negative correlation between $\eta_{it}$ and
$v_{it}$ would lead to a procyclical bias. Thus the relationship between the
direction of bias and the sign of the correlation is identical to that
discussed in the cross-sectional model in the previous section. 8
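
The selection argument above can be checked with a small simulation. The sketch below is illustrative only and is not the authors' procedure: it assumes standard normal errors, an employment rule of the form "work if $v \ge$ cutoff," no permanent component, and a hypothetical correlation of 0.5 in absolute value. The function name and all parameter values are assumptions made for the example.

```python
import random
from statistics import fmean


def mean_wage_change_of_stayers(corr, cutoff_first, cutoff_second, n=60_000, seed=1):
    """Average observed change in the transitory wage error among people
    who work in both periods, when the true wage does not move at all.

    corr          -- assumed correlation between the transitory wage error
                     and the employment error (both standard normal here)
    cutoff_first  -- a person works in period 1 if v >= cutoff_first
    cutoff_second -- same for period 2; a higher cutoff means a recession
    """
    rng = random.Random(seed)
    noise_scale = (1.0 - corr ** 2) ** 0.5
    changes = []
    for _ in range(n):
        # Employment errors, independent across the two periods.
        v1 = rng.gauss(0.0, 1.0)
        v2 = rng.gauss(0.0, 1.0)
        # Transitory wage errors, correlated with the same-period v.
        eta1 = corr * v1 + noise_scale * rng.gauss(0.0, 1.0)
        eta2 = corr * v2 + noise_scale * rng.gauss(0.0, 1.0)
        # Keep only the continuous-worker sample.
        if v1 >= cutoff_first and v2 >= cutoff_second:
            changes.append(eta2 - eta1)
    return fmean(changes)


# Unemployment rises between the two periods (the cutoff moves up).
positive_case = mean_wage_change_of_stayers(corr=0.5, cutoff_first=-1.5, cutoff_second=-1.0)
negative_case = mean_wage_change_of_stayers(corr=-0.5, cutoff_first=-1.5, cutoff_second=-1.0)
print(f"positive correlation: mean change = {positive_case:+.3f}")
print(f"negative correlation: mean change = {negative_case:+.3f}")
```

Under these assumptions the stayers show an average wage increase when the correlation is positive (a spurious countercyclical movement) and an average decrease when it is negative (a spurious procyclical movement), even though the true wage is constant.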

The bias in this differencing procedure could be worse than if one 
were simply to compare the mean wages of workers separately in the 
two periods. Assuming for illustration that $v_{it}$ is serially uncorrelated,
(8) can be written as 

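$$
\begin{aligned}
&E(\ln W_{i,t+1} - \ln W_{it} \mid v_{i,t+1} \ge -Z_{i,t+1}\gamma,\ v_{it} \ge -Z_{it}\gamma) \\
&\quad = (X_{i,t+1} - X_{it})\beta + \sigma_{\eta v}(\lambda_{i,t+1} - \lambda_{it}),
\end{aligned}
\tag{9}
$$

where $\sigma_{\eta v}$ is the covariance between $\eta_{it}$ and $v_{it}$ and $\lambda_{it}$ is as defined in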
(6). If instead we were simply to compare the mean wages of those 
employed at each point in time, as in (5), and to difference across 
periods, we would have 

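$$
\begin{aligned}
&E(\ln W_{i,t+1} \mid v_{i,t+1} \ge -Z_{i,t+1}\gamma) - E(\ln W_{it} \mid v_{it} \ge -Z_{it}\gamma) \\
&\quad = (X_{i,t+1} - X_{it})\beta + (\sigma_{\mu v} + \sigma_{\eta v})(\lambda_{i,t+1} - \lambda_{it}),
\end{aligned}
\tag{10}
$$

where $\sigma_{\mu v}$ represents the covariance between $\mu_i$ and $v_{it}$. The relation
between the biases in (9) and (10) is indeterminate. The bias in (9)
would be worse than that in (10) if $|\sigma_{\eta v}| > |\sigma_{\mu v} + \sigma_{\eta v}|$, which would be
the case if $\sigma_{\eta v} < 0$ and $\sigma_{\mu v} > 0$ but smaller in absolute value than $\sigma_{\eta v}$. 9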
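
The comparison between (9) and (10) can be made concrete with a few lines of code. This is a sketch under stated assumptions, not the paper's computation: it takes $\lambda$ to be the usual inverse Mills ratio $\phi(Z\gamma)/\Phi(Z\gamma)$ with the variance of $v$ normalized to one (the definition in (6) is not reproduced here), and the covariances and index values are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def inverse_mills(index):
    """pdf(index) / cdf(index): the mean of a standard normal v
    conditional on v >= -index (assumed form of lambda)."""
    return _STD_NORMAL.pdf(index) / _STD_NORMAL.cdf(index)


def bias_first_difference(cov_eta_v, index_t, index_t1):
    """Selection term in (9): only the transitory covariance matters."""
    return cov_eta_v * (inverse_mills(index_t1) - inverse_mills(index_t))


def bias_level_comparison(cov_mu_v, cov_eta_v, index_t, index_t1):
    """Selection term in (10): permanent and transitory covariances add."""
    return (cov_mu_v + cov_eta_v) * (inverse_mills(index_t1) - inverse_mills(index_t))


# Hypothetical values: negative transitory covariance, smaller positive
# permanent covariance, and an employment index that falls in a recession.
bias_9 = bias_first_difference(cov_eta_v=-0.2, index_t=1.5, index_t1=1.0)
bias_10 = bias_level_comparison(cov_mu_v=0.1, cov_eta_v=-0.2, index_t=1.5, index_t1=1.0)
print(f"bias in (9):  {bias_9:+.4f}")
print(f"bias in (10): {bias_10:+.4f}")
```

With these illustrative numbers the two covariances partly offset each other in (10), so differencing produces the larger bias in absolute value, which is the case described in the text.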

This discussion should make clear that the advantages of panel 
data, as well as their proper use, are not as transparent as they might 
first appear. There are in fact only three advantages of panel data 
over other forms of micro data such as a single cross section or a series 
of repeated, independent cross sections. First, relative to a single cross 
section, panel data allow direct estimation of the effects of changes in 
the aggregate unemployment rate. In a single cross section, either a 
cross-sectionally defined unemployment rate (e.g., an area rate) or 
none at all must be used. However, the estimate of $\rho$ obtained in a
single cross section could be used to adjust aggregate conditional 
wage data, so this advantage of panel data is not decisive. 10 Second, 
relative to single and repeated cross-section data, panel data can im- 

8 In economic terms, the sign of the correlation tells us whether those with high or
low transitory wages are more or less likely to be laid off in a recession. There are few
theories to guide our priors on the sign of this correlation, though the theory of
nominal contracts of Fischer (1977) would generate a negative correlation.

9 Although most studies that first-difference micro data ignore this problem, Bils
(1985) performs a selection bias correction to (9) in his app. B. However, the
inefficiency of his correction method (a result of both the method and the drastic
reduction in sample size by which he implements it) leads to high standard errors and
an inability to reliably detect the presence and importance of selection bias. See Nelson
(1984) for a discussion of the inefficiency of the two-step estimator.

10 This would require an estimate of the effect of the aggregate unemployment rate
on $\lambda$. However, given this, the term $\rho\sigma_\varepsilon\lambda$ could be subtracted from aggregate conditional
wage means, and the resulting wage series could be used in an aggregate analysis.




prove the efficiency of the estimates if the error terms in the model 
are serially correlated, for example, if individual effects such as $\mu_i$ are
present. Third, if the individual-specific effect in the error term is 
correlated with the regressors, panel data may reduce the asymptotic 
bias obtained with single or repeated cross sections. 11 

The efficiency improvement is illustrated most clearly in the well- 
known random effects model, which we shall estimate. In the pres¬ 
ence of individual effects, estimates from this model would be more 
efficient than those from a set of repeated and independent cross 
sections, though both would be consistent. Assuming individual- 
specific effects in the error terms of both wage and employment equa¬ 
tions, we have the error structure 

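$$\varepsilon_{it} = \mu_i + \eta_{it}, \tag{11}$$

$$v_{it} = \phi_i + \omega_{it}. \tag{12}$$

We assume that

$$(\mu_i,\ \phi_i,\ \eta_{it},\ \omega_{it}) \sim N(0, \Omega), \tag{13}$$

where

$$
\Omega =
\begin{pmatrix}
\rho_1\sigma_\varepsilon^2 & \rho_3\sigma_\mu\sigma_\phi & 0 & 0 \\
\rho_3\sigma_\mu\sigma_\phi & \rho_2\sigma_v^2 & 0 & 0 \\
0 & 0 & \sigma_\eta^2 & \rho_4\sigma_\eta\sigma_\omega \\
0 & 0 & \rho_4\sigma_\eta\sigma_\omega & \sigma_\omega^2
\end{pmatrix}
\tag{14}
$$

and

$$
\rho_1 = \frac{\sigma_\mu^2}{\sigma_\varepsilon^2}, \qquad
\rho_2 = \frac{\sigma_\phi^2}{\sigma_v^2}, \qquad
\rho_3 = \frac{\sigma_{\mu\phi}}{\sigma_\mu\sigma_\phi}, \qquad
\rho_4 = \frac{\sigma_{\eta\omega}}{\sigma_\eta\sigma_\omega}.
\tag{15}
$$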


We further assume that $\eta_{it}$ and $\omega_{it}$ are white-noise errors and that all
serial correlation arises from the fixity of the individual-specific effects. 12
The parameters $\rho_1$ and $\rho_2$ represent the autocorrelation
coefficients for $\varepsilon_{it}$ and $v_{it}$, respectively, and also equal the fraction of
the $\varepsilon_{it}$ and $v_{it}$ variances accounted for by the permanent effect. We
allow the cross-equation correlation between the permanent effects to
be $\rho_3$ and that between the transitory effects to be $\rho_4$. The likelihood
function is shown in Appendix A.
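
To make the error structure concrete, the following sketch assembles the covariance matrix of the four error components from the four correlation parameters and the two total variances. It is only an illustration of the parameterization described above: the function name is invented for the example, the component ordering (permanent wage, permanent employment, transitory wage, transitory employment) is an assumption, and the numerical inputs are hypothetical rather than estimates from the paper.

```python
from math import sqrt


def error_covariance(rho1, rho2, rho3, rho4, var_eps=1.0, var_v=1.0):
    """Covariance matrix of (permanent wage effect, permanent employment
    effect, transitory wage error, transitory employment error).

    rho1, rho2 -- share of each total variance due to the permanent effect
    rho3       -- correlation between the two permanent effects
    rho4       -- correlation between the two transitory errors
    """
    var_mu = rho1 * var_eps            # permanent part of the wage error
    var_phi = rho2 * var_v             # permanent part of the employment error
    var_eta = (1.0 - rho1) * var_eps   # transitory remainder
    var_omega = (1.0 - rho2) * var_v
    cov_perm = rho3 * sqrt(var_mu * var_phi)
    cov_trans = rho4 * sqrt(var_eta * var_omega)
    return [
        [var_mu, cov_perm, 0.0, 0.0],
        [cov_perm, var_phi, 0.0, 0.0],
        [0.0, 0.0, var_eta, cov_trans],
        [0.0, 0.0, cov_trans, var_omega],
    ]


# Hypothetical parameter values, for illustration only.
omega = error_covariance(rho1=0.6, rho2=0.25, rho3=0.4, rho4=-0.25)
for row in omega:
    print("  ".join(f"{entry:7.4f}" for entry in row))

# Because the transitory errors are white noise, the covariance of the wage
# error with its own lag is just the permanent variance, so the implied
# autocorrelation equals rho1.
autocorrelation = omega[0][0] / (omega[0][0] + omega[2][2])
print(f"implied autocorrelation of the wage error: {autocorrelation:.2f}")
```

The block-diagonal form reflects the assumption that permanent and transitory components are uncorrelated with each other, which is what allows the first two parameters to be read as autocorrelation coefficients.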


11 In addition, as we will discuss below in the context of our results, it appears as
though the richer error specification allowed by panel data may reduce the specification
bias generated by general parameter heterogeneity. However, we have no formal proof
that this is to be expected in general.

12 We make this assumption entirely for computational reasons, for the numerical
integrations to be discussed in App. A would be intractable otherwise.
An alternative model, also widely known in the panel data literature,
is the fixed effects model. Here the individual effects $\mu_i$ and $\phi_i$
are treated as fixed constants for each individual, constants that must 
be implicitly or explicitly estimated as part of the procedure. The 
advantage of this treatment of the error terms is that, unlike the 
random effects and OLS models, consistency of the coefficient esti¬ 
mates may be retained even when the individual-specific terms are 
correlated with the regressors in the equation. However, no fixed 
effects can be estimated for those who are always employed or always 
unemployed, so they must be excluded from the sample. The proba¬ 
bilities in the likelihood function must be conditioned to account for 
this sample exclusion. The likelihood function is shown in Appendix 
A. 13

Although the econometric models we have specified are more com¬ 
putationally complex than those in most applied work, many restric¬ 
tions have been imposed to maintain a reasonable level of tractability. 
For example, in our random effects model we have assumed that the 
variances of the various error terms are constant over time. In all our 
models described above, we have assumed some combination of uni¬ 
variate and bivariate normality for the various error terms, though we 
test this assumption in our empirical work. Our approach to gauging 
the importance of these restrictions is to conduct specification tests to 
determine the robustness of our results. Unfortunately, many of the 
specification tests in econometrics are not appropriate for our models 
because residuals of the usual kind cannot be calculated when depen¬ 
dent variables are truncated or censored. We defer discussion of 
which of these tests we apply to subsequent sections. 


III. Data and Central Results 

A. Data 

Our data are taken from the National Longitudinal Survey of Young 
Men (NLS), a nationally representative sample of the male population 
aged 14-24 drawn in 1966. 14 The 5,225 males in the sample were 

13 There is one difficulty with the fixed effects model: Estimates of the fixed effects
are inconsistent unless $T_i$ is “large,” and this inconsistency is transmitted to the
coefficient estimates in limited dependent variable and selection bias models (Heckman 1981;
Chamberlain 1984). Heckman has provided some Monte Carlo evidence indicating that
the inconsistency is small if T > 8, so we shall estimate the model since we have a
maximum of T = 12 in our data (see below).

14 This is the best of the commonly available micro panels for our purposes. Unlike
the other NLS cohorts, it is simultaneously male and uncontaminated by retirement
trends, and it is available for a long period. Unlike the Michigan Panel Study of Income
Dynamics, it contains a consistent measure of the hourly wage rate and hours of work in
the week of interview for all waves. The Michigan panel has a measure of annual
earnings, but such data are contaminated by recall bias and are less preferable than
weekly data in dealing with the selection bias problem (see below).
interviewed in 12 of the 16 years from 1966 to 1981, and data were
collected in each year on employment status, wage rates, and various 
sociodemographic characteristics. The sample design was stratified by 
race and other characteristics, so we employ survey weights in all our 
analyses. 15 We restrict the sample in each year to those at least 21 
years old at the interview date, who had completed their schooling 
and military service as of the interview date, and who met a number 
of miscellaneous requirements for complete data. 16 Our exclusions 
are shown in detail in Appendix B. Our final analysis sample contains 
4,439 males and 23,927 person-year observations, implying an aver¬ 
age number of years per person of 5.4. Approximately 88 percent of 
the observations are for periods in which the person is working. To 
economize on computational costs we draw a random half-sample, 
leaving us with 2,219 males and 11,886 person-year observations. For 
the fixed effects model we must exclude the men always employed 
and always unemployed, which leaves us with 723 men and 4,581 
person-year observations. 

Means of the variables used in our central analysis are shown in 
table 1. Our wage rate is an hourly straight-time measure in 1967 
consumer price index dollars; below, we test alternative wage rates, 
such as those including overtime. 17 We should note that, while Bils 
(1985) claims to have included overtime earnings and hours in every 
year, we employ the same data set and do not believe it is possible to 
do so. We should also note that our wage measure is a point-in-time 
measure, taken as of the date of interview, rather than an annual 
measure (e.g., annual earnings divided by annual hours). It may at 
first appear that an annual measure would mostly eliminate the selection
bias problem because almost all men work at least once during
the year and hence have at least one wage observation. But bias would 
still be present if the wage fluctuates within the year, for in the pres¬ 
ence of selectivity, the wage would be systematically different in work¬ 
ing weeks than in nonworking weeks. Moreover, even if there is no 
selection bias, regressing an annual wage measure on an annual un- 


15 The normalized survey weights are entered before each log probability in the
likelihood functions. However, in our analysis it turned out that unweighted estimates
are almost identical to the weighted ones.

16 We impose all screens on a period-specific basis, omitting only those time periods
of each cross-sectional unit for which the screening criteria are not met. For example,
our age restriction generates an increasing sample size with each year.

17 For our wage measure we use wage and hours information as of the interview date
rather than for the year prior to the interview. For workers without an hourly wage
(i.e., salaried workers), we divide current earnings by current straight-time hours
(equals total hours minus overtime) when such information is available, and "usual"
hours of work otherwise. We choose this measure because we desire a point-in-time
wage measure rather than an annual average and because overtime questions were not
asked every year.
TABLE 1

Means of the Variables Used in Central Analysis

Variable                                        Mean
Log wage^a                                      1.07
National unemployment rate (U)^b                6.34
Education (EDUC)                               12.60
Years of labor force experience (EXP)^c         7.96
EXP squared (EXPSQ)                            86.83
White dummy (WHITE)                              .74
Number of children (KIDS)                       1.29
Wife present (WIFE)                              .69

Note.—Unweighted means. Sample size = 11,886 (10,510 working
periods).

a Workers only, in 1967 consumer price index dollars. Constructed
as straight-time hourly wage rate or, if salaried, as current earnings
divided by actual or usual hours of work per week.
b All civilian workers 16 or older, seasonally adjusted.
c Constructed as interview date minus the completion date of schooling
or the military, whichever was later.


employment rate can be shown to result in a coefficient biased toward 
zero. 18 Thus the point-in-time measure is actually superior. 

Our specification of the $X_{it}$ and $Z_{it}$ vectors is based on the assumption
that $X_{it}$ should include only variables that directly affect an individual's
marginal product and that $Z_{it}$ should include all variables in
$X_{it}$ plus additional ones that may affect hours of work and employment
status independent of the wage. That $Z_{it}$ should include all
variables in $X_{it}$ follows if it is assumed that the wage rate is a determinant
of employment status. We include education, experience, experience
squared, and a race dummy in $X_{it}$ and additionally the number
of children and the presence of a wife in $Z_{it}$. The latter two variables
are presumed to affect labor supply propensities but not marginal
products. However, we examine the sensitivity of our results to this
specification by providing estimates with identical $X_{it}$ and $Z_{it}$ vectors as
well. We also include the unemployment rate in both vectors.


B. Central Results 

Table 2 shows the results of OLS and maximum likelihood estima¬ 
tions. The latter estimates are shown with no individual effects, with 
random effects, and with fixed effects. Two specifications of the equations
are presented, one including the observed variables just discussed
and one with no observed covariates other than the unemployment
rate and trend.

18 No bias is present if the average annual unemployment rate in the regression is
computed over the working weeks of the individual rather than over all 52 weeks in the
year. If there is a true cyclical effect but no selectivity, a bias arises with a 52-week
average because high-unemployment weeks are more likely to be those for which no
wage is observed. One can consequently show that the 52-week average will always
change more in absolute value than a correctly computed average (i.e., one computed
over working weeks only). Unfortunately, most data sets do not provide information on
the specific weeks in the year in which the individual does and does not work.

The OLS estimates show evidence of significant procyclical behav¬ 
ior in the wage. The magnitudes of the coefficients imply that a one- 
percentage-point increase in the unemployment rate lowers the wage 
rate by a little less than 1 percent. The addition of the extra regressor 
set results in an increase in the absolute value of the coefficient, imply¬ 
ing that failure to control for observed heterogeneity leads to coun¬ 
tercyclical bias in the pattern of real wage movement. Put differently, 
the implication is that those with observed characteristics corre¬ 
sponding to lower wages (lower education, lower experience, non¬ 
white) are more likely to leave employment in a cyclical downturn. 
Other coefficients in the wage equation have conventional signs and 
magnitudes: education and experience have positive effects, with the 
latter decreasing with additional experience. White workers have 
higher wages than nonwhites. There also appears to have been a small 
but statistically significant upward trend in wages over the period of 
about 0.6 percent per year, with experience held constant. 

The maximum likelihood estimates with no individual effects give 
insignificant unemployment rate coefficients, implying an acyclic 
wage. However, the other coefficients in the wage equation are quite 
close to the OLS estimates. The coefficients in the employment equa¬ 
tion show evidence of the expected procyclical employment behavior 
but negative trends in employment probabilities. The other coeffi¬ 
cients indicate that employment rates are higher for those with more 
education, those with children, and those who are white and married. 

This result implies that the OLS unemployment coefficient is pro- 
cyclically biased, an implication that is consistent with the significantly 
negative estimate of $\rho$. That estimate is also quite large in magnitude
(-.70 to -.75). Thus the prima facie evidence implies that the near 
universal finding of procyclicality in the 1970s in past studies utilizing 
micro data is an artifact of selection bias. 

The estimates from the fixed effects model show, as in the 
no-effects model, insignificant unemployment rate coefficients. The 
implied strong procyclical bias in OLS is a result not only of the 
fixed effects but of the negative correlation of the transitory errors 
($\rho_4 = -.222$). 19

19 Our fixed effects model results in table 2 were obtained from a model with fixed
effects only in the wage equation. Attempts to include fixed effects in the employment
equation were unsuccessful, for they were estimated to be zero. Also, conditioning the
probabilities in the fixed effects model on the probability of inclusion is absolutely
necessary in our data. Without such conditioning the unemployment rate coefficient
changes from .0011 to .0012 and is highly insignificant. Likewise, the random effects
model estimated on the fixed effects sample yields an insignificant unemployment rate
coefficient of .0021.
The random effects estimates tell a different story, for the unemployment
rate coefficients are negative and significant in both specifications.
The estimates from specification (1) imply no significant
procyclical selectivity bias in OLS, but this is a result of the exclusion 
of the variables for observed heterogeneity, which, as we noted above, 
induces a countercyclical omitted variable bias. For the preferred 
specification (2), the unemployment rate coefficient of -.0066 is still
considerably smaller in absolute value than the corresponding OLS effect
of -.0096, an indication
that procyclical selectivity bias is still present but weaker than
was indicated by the no-effects and fixed effects models. The weaker 
procyclical bias evident in the random effects model is reflected in the 
estimates of the covariance parameters as well. In specification (2), the 
random effects results indicate a negative correlation of the transitory 
errors ($\rho_4 = -.252$) but a positive correlation of the permanent errors
($\rho_3 = .436$). The resulting composite correlation is virtually equal
to zero ($\rho = .029$). The covariance estimates also indicate that individual
effects are important in both equations, though more so in the
wage equation ($\rho_1 = .608$, $\rho_2 = .239$).
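
The way the permanent and transitory correlations offset each other can be reproduced from the four reported parameters. The sketch below assumes that the composite correlation is the overall correlation between the wage error and the employment error, so that each cross-equation correlation is weighted by the geometric mean of the corresponding variance shares. That weighting is an inference from the error components structure, not a formula quoted from the paper, and the function name is invented for the example.

```python
from math import sqrt


def composite_correlation(rho1, rho2, rho3, rho4):
    """Overall correlation between the wage and employment errors implied by
    the variance shares (rho1, rho2) and the cross-equation correlations of
    the permanent (rho3) and transitory (rho4) components."""
    permanent_part = rho3 * sqrt(rho1 * rho2)
    transitory_part = rho4 * sqrt((1.0 - rho1) * (1.0 - rho2))
    return permanent_part + transitory_part


# Random effects estimates for specification (2), as reported in the text.
rho = composite_correlation(rho1=0.608, rho2=0.239, rho3=0.436, rho4=-0.252)
print(f"composite correlation: {rho:.3f}")
```

Under this assumption the positive permanent term and the negative transitory term nearly cancel, and the result should come out close to the composite value of .029 reported above.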

Our estimates of $\rho_3$ and $\rho_4$ have interesting economic interpretations.
While it appears that those with lower permanent wages and
presumably lower skill levels are more likely to leave employment in a 
recession, consistent with the conventional view and with those of a 
few economic models, the probability of unemployment is also higher 
for those with wages above their own individual-specific trends (even 
if they have high permanent wages). Thus those having a particularly 
good temporary wage draw appear to be more vulnerable to transi¬ 
tory negative employment shocks than those with a temporarily bad 
wage draw. This would appear to contradict those equilibrium models 
that predict that high transitory wages increase labor supply. 

It is important to note that all our models imply procyclical wage 
movement that is much weaker than that found in several recent 
micro data studies. For example, while table 2 shows that our OLS 
estimates are often only slightly more negative than our random effects
estimates, the first-difference estimates of Bils (1985) are two to
six times more negative. Moreover, our results tend to indicate that 
the compositional bias is in a procyclical direction, while Bils (who 
controlled for only observed characteristics) found the opposite. 

Nevertheless, the unemployment rate estimates from the three 
maximum likelihood models are considerably different and hence 
imply considerably different degrees of procyclical bias in the OLS 
unemployment rate coefficient. To discriminate between these mod¬ 
els, we conduct a number of tests of specification, including tests of 
the normality assumption. Virtually all these tests favor the random 
effects model and reject the no-effects and fixed effects models. 
First, it can be seen from table 2 that a likelihood ratio test overwhelmingly
rejects the no-effects model in favor of the random effects
model ($\chi^2 = 5{,}210$). Apparently the addition of the random
effects and the extra selection bias parameter for the cross-equation 
correlation of the individual effects greatly increase explanatory 
power. A likelihood ratio test between the random and fixed effects 
models is complicated by the different sample sizes of the two, but 
when the random effects model is randomly sampled down to the 
fixed effects sample size, the value of the maximized log likelihood 
function is -2,627. This implies that the hypothesis that the random
and fixed effects models are the same cannot be rejected ($\chi^2 = 472$). 20
Second, allowing a Box-Cox transformation of the dependent
variable in the wage equation yields sensible results for the random 
and fixed effects models—the Box-Cox transformation parameters 
($\lambda$) were .13 and .07, with unemployment rate coefficients of -.0078
and -.0025, respectively—but the no-effects maximum likelihood 
model fails to converge at all with this transformation, yielding values 
of $\rho$ of -1.
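
For readers unfamiliar with the transformation, the sketch below shows the standard Box-Cox form and how close the reported parameter values are to the logarithmic case. It is a generic illustration, not the authors' estimation code; the wage value of 3.0 is hypothetical.

```python
from math import log


def box_cox(wage, lam):
    """Standard Box-Cox transform: (w**lam - 1) / lam, with the natural
    logarithm as the limiting case when lam is zero."""
    if lam == 0:
        return log(wage)
    return (wage ** lam - 1.0) / lam


example_wage = 3.0  # hypothetical hourly wage in constant dollars
for lam in (0.0, 0.07, 0.13, 1.0):
    print(f"lambda = {lam:4.2f}: transformed wage = {box_cox(example_wage, lam):.4f}")
```

A parameter of 1 leaves the wage in levels (shifted by one) and a parameter of 0 gives the log wage, so values such as .13 and .07 correspond to a specification close to the log wage used elsewhere in the analysis.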

A more formal test for specification that can be used to discriminate 
between the models is the information matrix test. This is a test for 
general misspecification in maximum likelihood models (White 1982) 
and has been shown to be equivalent to a test for local unobserved 
parameter heterogeneity by Chesher (1984); in particular, it is a test 
for local departures from normality. The details of our construction 
of the information matrix statistics are given in Appendix C. Table 3 
shows the information matrix statistics for the three models, each 
distributed as a chi-squared statistic under the null of no mis¬ 
specification. All three models are rejected on the full sample at con¬ 
ventional levels of confidence, but on a one-third sample the random 
effects model is accepted. The higher rate of rejection for the larger 
sample size is no doubt a result of Lindley’s paradox. Thus of the 
three models considered, the random effects model evidences the 
least amount of misspecification. The poor result for the fixed effects 
model may be a result of its inconsistency, though we found little 
difference between it and the random effects model in the likelihood 
ratio test (as mentioned above). 
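
The 5 percent critical values used with these statistics can be checked without statistical tables. The sketch below uses the Wilson-Hilferty cube-root normal approximation to the chi-squared quantile, which is a substitute for an exact quantile routine and is accurate to well within one unit at these degrees of freedom. The degrees of freedom are those listed for the three models in table 3; the function name is invented for the example.

```python
from statistics import NormalDist


def chi2_critical(df, level=0.05):
    """Approximate upper-tail critical value of a chi-squared variable with
    `df` degrees of freedom (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(1.0 - level)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3


for model, df in (("no effects", 170), ("random effects", 230), ("fixed effects", 66)):
    print(f"{model:15s} df = {df:3d}  critical value = {chi2_critical(df):.1f}")
```

A model is rejected when its information matrix statistic exceeds the corresponding critical value.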

While the evidence thus far suggests the random effects model as 
the best model, its mixed performance under the information matrix 
test leads us to calculate a goodness-of-fit test for the log normality 
assumption for the wage. We divide the log wage distribution into 70 
equally spaced intervals and compare the empirical distribution of the 


20 Of course, the likelihood ratio test in this case is strictly invalid because the fixed
effects estimates are inconsistent.
TABLE 3

Information Matrix Tests for Maximum Likelihood Models

                            No Effects    Random Effects    Fixed Effects
Full sample $\chi^2$               909               478            2,584
One-third sample $\chi^2$          605               198              930
Critical value (5%)                201               266               86
Degrees of freedom                 170               230               66


data of workers with the predicted (nonnormal) distribution implied 
by the fitted parameters. Under the null that the distributional as¬ 
sumption is correct, a transform of the differences in the distributions 
is distributed chi-squared with 70 degrees of freedom (Kendall and 
Stuart 1977, p. 381). 21 The results indicate that the random effects 
model is rejected at the 95 percent level on the whole sample ($\chi^2_{70} = 287$)
but somewhat less so on a one-third sample ($\chi^2_{70} = 212$; critical
value = 91). However, as figure 2 indicates for the full sample, the fit 
is fairly close overall, but the fitted distribution is shifted slightly to the 
right near the mode. 
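
The mechanics of a binned goodness-of-fit comparison of this kind can be sketched as follows. This is a generic Pearson statistic on equally spaced intervals, offered as an assumption about the general form of the test rather than a reproduction of the exact transform used in the paper; the function names and the toy data are hypothetical.

```python
def bin_counts(values, low, high, n_bins):
    """Count observations in n_bins equally spaced intervals on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for x in values:
        if low <= x <= high:
            # The upper endpoint is assigned to the last interval.
            k = min(int((x - low) / width), n_bins - 1)
            counts[k] += 1
    return counts


def pearson_statistic(observed_counts, predicted_probs):
    """Sum over intervals of (observed - expected)^2 / expected."""
    n = sum(observed_counts)
    return sum(
        (obs - n * p) ** 2 / (n * p)
        for obs, p in zip(observed_counts, predicted_probs)
    )


# Toy example with 4 intervals instead of 70, purely for illustration.
observed = [18, 22, 30, 30]
predicted = [0.25, 0.25, 0.25, 0.25]
print(f"statistic = {pearson_statistic(observed, predicted):.2f}")
```

In the application described above the predicted interval probabilities would come from the fitted model, and the statistic would be compared with the chi-squared critical value of 91 quoted in the text.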

Given this failure of the normality test, we partially relax the nor¬ 
mality assumption and compute semiparametric estimates of the ran¬ 
dom effects model by replacing the bivariate normal distribution of 
the two individual effects with a discrete distribution. The probability 
mass at each point of the distribution is estimated, and searches over 
the locations and numbers of mass points are undertaken. Table 4 
shows the results of our best model, that with 25 mass points. 22 The 
results show an unemployment rate coefficient of - .0049, somewhat 
below that obtained in the maximum likelihood random effects model 
(- .0066) but still negative and significant. The other coefficients are 
close to their random effects values as well. The cross-equation correlation
of transitory errors is strong and negative ($\rho_4 = -.432$), and
the composite correlation of the permanent and transitory errors is
-.36. 23 Thus while our assumption of normality in the maximum


21 If the number of estimated parameters were subtracted from the degrees of freedom,
the statistic would be distributed as chi-squared with 49 degrees of freedom. The
correct statistic has degrees of freedom between 49 and 70. We use 70 in the text. The
regressor set is integrated out to compute the predicted distribution.

22 The 25 mass points arise from the combination of five mass points for each of the
individual effects. We experimented with up to six mass points; with six, most of the
added points were estimated to have zero mass. Our search over the location of the
points was informal (i.e., without a formal optimization routine). Accordingly, the
standard errors in table 4 should be considered only to be approximate. Our procedure
differs from that of Heckman and Singer (1982) by our not optimizing over the number
of points and our doing so only informally over the location of the points.

23 The estimate of $\rho_3$ is slightly negative, contrary to our prior results. This estimate
Fig. 2.—Fitted and actual log wage distribution

likelihood random effects model appears to have resulted in estimates 
that imply somewhat too little procyclical bias in OLS, the estimated 
direction of bias is the same. Hence, our central findings appear to be 
robust to this specification. In our subsequent sensitivity tests, we shall 
continue to employ the random effects model because it is computationally
much simpler than the semiparametric model and because it
gives very similar results. 
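
The discrete distribution used in the semiparametric model can be pictured as a 5 x 5 grid of support points for the two individual effects, with a probability weight on each point. The sketch below builds that grid from the five locations listed in the note to table 4 and shows how a correlation between the permanent effects can be derived from a set of weights. The weights used here are hypothetical (the estimated weights are not reported), and the function name is invented for the example.

```python
from itertools import product
from math import sqrt

# Five support points for each individual effect, as listed in table 4.
POINTS = (-0.75, -0.35, 0.0, 0.35, 0.75)
SUPPORT = list(product(POINTS, POINTS))  # 25 (wage effect, employment effect) pairs


def implied_correlation(weights):
    """Correlation between the two individual effects implied by a set of
    nonnegative weights on the 25 support points."""
    total = sum(weights)
    probs = [w / total for w in weights]
    mean_mu = sum(p * mu for p, (mu, _) in zip(probs, SUPPORT))
    mean_phi = sum(p * phi for p, (_, phi) in zip(probs, SUPPORT))
    var_mu = sum(p * (mu - mean_mu) ** 2 for p, (mu, _) in zip(probs, SUPPORT))
    var_phi = sum(p * (phi - mean_phi) ** 2 for p, (_, phi) in zip(probs, SUPPORT))
    cov = sum(
        p * (mu - mean_mu) * (phi - mean_phi)
        for p, (mu, phi) in zip(probs, SUPPORT)
    )
    return cov / sqrt(var_mu * var_phi)


# Two hypothetical weight patterns: equal mass everywhere, and mass only
# where the two effects take the same value.
uniform = [1.0] * len(SUPPORT)
diagonal = [1.0 if mu == phi else 0.0 for mu, phi in SUPPORT]
print(f"uniform weights:  correlation = {implied_correlation(uniform):+.2f}")
print(f"diagonal weights: correlation = {implied_correlation(diagonal):+.2f}")
```

Because the weights are free parameters, the implied variances and correlation are not restricted to the values a bivariate normal distribution would impose, which is the sense in which the normality assumption is relaxed.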

In table 5 we provide a large number of sensitivity tests for the 
model, showing both OLS and maximum likelihood random effects 
results. 24 Changing the definition of the wage by using different 
hours-worked variables for salaried workers (WAGE2 and WAGE3)
and by using the Ohio State “key wage” (WAGE5) makes little differ¬ 
ence to the random effects or OLS unemployment rate coefficients. 

of $\rho_3$ has less credibility than that from our previous model because, as Heckman and
Singer (1982) showed with a Monte Carlo exercise, nonparametric estimates of the
error distribution are extremely poor. However, like Heckman and Singer, we find that
the coefficient estimates are robust.

24 The no-effects maximum likelihood model was also estimated for all these alternatives.
The results were unchanged.
TABLE 4

Semiparametric Random Effects Maximum Likelihood Estimates

                               Log Wage      Employment
U                              -.0049**      -.069**
                               (.0024)       (.013)
TIME                           .013**        -.013**
                               (.001)        (.006)
EDUC                           .047**        .060**
                               (.001)        (.007)
EXP                            .023**        -.003
                               (.001)        (.010)
EXPSQ                          -.069**       -.108**
                               (.005)        (.042)
WHITE                          .177**        .226**
                               (.012)        (.057)
KIDS                                         .080**
                                             (.014)
WIFE                                         .396**
                                             (.035)
Constant                       .168**        .573**
                               (.025)        (.121)
$\rho_4$                       -.432**
                               (.043)
(row label illegible)          .272**
                               (.002)
Log likelihood function        -6,283
Derived from estimates
of weights:
  $\rho_1$                     .277
  $\rho_2$                     .269
  $\rho_3$                     -.168

Note.—Standard errors are in parentheses. Nonparametric bivariate distribution of ($\mu_i$, $\phi_i$) with 25 mass points at
the five points (0, ±.35, ±.75) for each variable. Estimated weights at each point are not shown.

** Significant at the 5 percent level.


We also tested the inclusion of overtime on the subset of years for 
which overtime data are available (not shown). We found that there 
was no change in the unemployment coefficient, probably because 
very little overtime was reported. However, employing an annual 
wage measure from the year previous to that of the interview 
(WAGE4) lowers both random effects and OLS coefficients drastically 
and leaves them insignificant. This is evidence that annual wage measures
result in downward-biased estimates, as we discussed previously,
and indicates that our point-in-time measure of the wage is a prefera¬ 
ble measure. Table 5 also shows the results of using the producer 
price index (PPI) instead of the consumer price index (CPI) to test the 
findings of Geary and Kennan (1982), whose work indicated that 
employing the PPI changed a countercyclical effect to an acyclic one.
In every case in our results the PPI makes the procyclical OLS or 
random effects coefficient larger, in some cases by a nontrivial magnitude.
This change is in the same direction as that found by Geary








TABLE 5

Ordinary Least Squares and Maximum Likelihood Random Effects
Unemployment Rate Coefficients under Alternative Specifications

                                                   Maximum Likelihood
                                    OLS            Random Effects
Wage definition:^a
  WAGE1:
    CPI                             -.0096**       -.0066**
                                    (.0036)        (.0023)
    PPI                             -.0128**       -.0104**
                                    (.0037)        (.0024)
  WAGE2:
    CPI                             -.0067**       -.0047*
                                    (.0037)        (.0024)
    PPI                             -.0109**       -.0084**
                                    (.0037)        (.0025)
  WAGE3:
    CPI                             -.0066*        -.0038*
                                    (.0035)        (.0022)
    PPI                             -.0108**       -.0075**
                                    (.0035)        (.0023)
  WAGE4:
    CPI                             -.0012         .0007
                                    (.0041)        (.0027)
    PPI                             -.0114**       -.0104**
                                    (.0041)        (.0027)
  WAGE5:
    CPI                             -.0072**       -.0041*
                                    (.0036)        (.0022)
    PPI                             -.0115**       -.0078**
                                    (.0036)        (.0023)
Cyclical indicators:^b
  U                                 -.0096**       -.0066**
                                    (.0036)        (.0023)
  Capacity utilization              .0024**        .0016**
                                    (.0010)        (.0007)
  Index of coincident indicators    .0024**        .0020**
                                    (.0006)        (.0005)
  Index of industrial production    .0025**        .0022**
                                    (.0007)        (.0005)
  Fraction of population employed   .0099**        .0053**
                                    (.0042)        (.0032)
  Total employment                  .0045*         .0019
                                    (.0029)        (.0017)
Identical regressors
in both equations^c                                -.0075**
                                                   (.0024)

Note.—Standard errors are in parentheses.
a Specification (2). CPI = consumer price index, PPI = producer price index. WAGE1: Reported wage for
hourly workers; for salaried workers, computed as current straight-time earnings divided by either current
straight-time hours or usual hours of work; usual hours are employed only when straight-time hours are not available.
WAGE2: Same as WAGE1, but overtime hours not subtracted from hours measure in years when available.
WAGE3: Same as WAGE1, but usual hours always utilized. WAGE4: Computed by dividing previous 12 months' earnings
by previous 12 months' hours, for all workers (not just for salaried), using previous year's CPI or PPI.
WAGE5: Ohio State "key wage," intended to measure straight-time hourly wage rate.
b Specification (2).
c KIDS and WIFE omitted from employment equation.
* Significant at the 10 percent level.
** Significant at the 5 percent level.
and Kennan, although in our case the sign of the coefficient is
unchanged by which price index is utilized.

Also shown in table 5 are the effects of using different measures of 
the cycle: capacity utilization, production, and employment indi¬ 
cators. These measures are in most ways preferable to the unemploy¬ 
ment rate, for they are based on direct measures of real activity rather 
than on subjective answers to survey questions about time spent in job 
search. However, the new measures give significant coefficients of the 
expected sign and hence are consistent with the unemployment rate 
estimates, though their magnitudes cannot be compared. Note also 
that a procyclical bias in the OLS coefficient estimate is found for all 
five of the alternative cyclical indicators, confirming our previous re¬ 
sults. As is also shown in the table, altering the specification to force 
identical regressor sets in both the wage and employment equations 
has no effect on the results. 
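
The comparison described above can be checked directly from the table 5 entries. The following is only a sketch: the coefficients are transcribed from the cyclical-indicator rows of table 5, while the function names and the summary measure (the gap between the absolute OLS and absolute maximum likelihood coefficients) are illustrative choices made here, not part of the original analysis. Because each pair of estimates has the same sign, a positive gap means that OLS makes the wage look more procyclical than the selection-corrected estimate does.

```python
# Coefficients transcribed from the cyclical-indicator rows of table 5:
# (OLS, maximum likelihood random effects).
TABLE5_CYCLICAL = {
    "U (unemployment rate)": (-0.0096, -0.0066),
    "Capacity utilization": (0.0024, 0.0016),
    "Index of coincident indicators": (0.0024, 0.0020),
    "Index of industrial production": (0.0025, 0.0022),
    "Fraction of population employed": (0.0099, 0.0053),
    "Total employment": (0.0045, 0.0019),
}


def ols_overstatement(ols, ml):
    """Return |OLS| - |ML| (illustrative measure, not from the paper).

    Positive values mean the OLS coefficient is larger in magnitude than
    the maximum likelihood random effects coefficient.
    """
    return abs(ols) - abs(ml)


def procyclical_bias_table(rows=TABLE5_CYCLICAL):
    """Map each indicator name to its rounded OLS-minus-ML magnitude gap."""
    return {
        name: round(ols_overstatement(ols, ml), 4)
        for name, (ols, ml) in rows.items()
    }


if __name__ == "__main__":
    for name, gap in procyclical_bias_table().items():
        print(f"{name:34s} {gap:+.4f}")
```

Every gap is positive, which is the sense in which the OLS estimates are procyclically biased for the unemployment rate and for all five alternative indicators.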

Table 6 shows the effects of several other specifications of interest.
Splitting the sample by time period gives evidence of considerably
larger procyclical effects, though not differing significantly between
the subperiods. This finding may be of interest for future research.
We should note as well that the bias in OLS appears to be much
greater in the earlier than in the later subperiod. Adding a 1-year lag
in the unemployment rate makes the contemporaneous effect much
more procyclical than before. Adding the inflation rate over the 12
months prior to the interview date to the regression increases the
estimated procyclicality of the unemployment rate and of an index of
coincident indicators, both by about one standard error. Also, the
inflation rate coefficients in these equations are significantly negative
(-.0040 and -.0054). This finding of real effects of the inflation rate
is of interest, for in an equilibrium labor market the inflation rate
should have no effect on real wages. Finally, testing for the
endogeneity of the unemployment rate by instrumental variables lowers
the cyclical effect considerably and leaves the coefficient insignificant.
However, the same instrumental variables technique applied to a
cyclical indicator of real activity (the index of industrial production) has
no effect on its coefficient.²⁵ Since the unemployment rate is a rather
poor measure of real activity, we give more credence to the latter
estimates. We conclude that our results are reasonably robust to these
alternative specifications.

IV. Aggregate and Sectoral Comparisons

Thus far the comparisons have been made between micro estimates 
from OLS equations and those from maximum likelihood models. 

25 No corrections to the standard errors have been made.
TABLE 6

Unemployment Rate Coefficients: Additional
Maximum Likelihood Random Effects
Specifications

                                        Coefficient

Subperiod estimates:
  1966-71                                -.0122**
                                         (.0060)
  1973-81                                -.0130**
                                         (.0025)
Lagged unemployment rate:
  U_t                                    -.0104**
                                         (.0028)
  U_{t-1}                                 .0089**
                                         (.0029)
Inflation control:^a
  U                                      -.0095**
                                         (.0029)
  Index of coincident indicators          .0026**
                                         (.0005)
Instrumental variables:^b
  U                                      -.0025
                                         (.0047)
  Index of industrial production          .0027**
                                         (.0006)

Note.—Standard errors are in parentheses. Specification (2).

^a The annual CPI inflation rate of the 12 months previous to the
interview date is used as an additional independent variable.

^b Instruments in each case: real energy price, real money
growth, trend, and 1-year lag in U or industrial production index.

** Significant at the 5 percent level.

Another issue relates to the difference between micro data estimates 
and those from aggregate time-series regressions. Not a great deal of 
investigation of this question is possible with this data set since only 12 
time-series observations are available. But they can be examined to 
determine their consistency or lack of consistency with the micro es¬ 
timates. 

Table 7 shows the results of simple wage regressions on the aggregate
data. The first column shows the result of regressing the conditional
wage means in our data in each year on the annual unemployment
rate. Since the wage means are based only on the working
sample in each year, they are biased estimates of the true means of the
wage distributions for the reasons we have discussed. The unemployment
rate coefficient of -.0074 is quite close to the OLS estimate in
the micro data, as should be expected, but naturally has a much larger
standard error. To examine directly the effect of selection bias on this
aggregate estimate, we use our maximum likelihood random effects
model to estimate the means of the untruncated wage distribution in
each year and to regress these on the annual unemployment rate and
TABLE 7

Aggregate Time-Series Log Wage Regressions

                                   Dependent Variable
             -----------------------------------------------------------------
             Conditional     Maximum Likelihood
             Log Wage        Random Effects      W^c             W^c
             Means^a         Wage Means^b        (All Workers)   (Manufacturing)
             (1)             (2)                 (3)             (4)

U            -.0074          -.0043               .0019          -.0000^d
             (.0106)         (.0098)             (.0121)         (.0093)
TIME          .024**          .011**              .0009           .0046
             (.004)          (.004)              (.0042)         (.0033)
Constant      .975**         -.093**             1.0030**        1.0452**
             (.048)          (.044)              (.0456)         (.0383)

Note.—Standard errors are in parentheses. Number of observations is 12. The
regressions in the first two columns are adjusted for heteroscedasticity.

^a Conditional log wage means per year from the NLS data.

^b Values of the dependent variable are obtained from estimation of
coefficients on time dummies in maximum likelihood random effects
specification (2) when the trend and U are replaced by such dummies.

^c Source: Employment and Training Report of the President (1982, p. 245).

^d Less than .00005 in absolute value.

** Significant at the 5 percent level.


a trend. The results, shown in column 2, indicate an aggregate effect
of -.0043, less than two-thirds the size of the OLS effect. The
standard error of the coefficient is much greater, of course, since there
are only 12 observations. In fact, Coleman (1984) has pointed out that
this standard error is likely to be closer to the true one if there are
unobserved time-specific error components in our micro data model,
which we assumed were not present.²⁶ In any case, the aggregate
results indicate that a significant procyclical bias is present if the
selection of workers into and out of the work force is ignored.

To compare our results more closely with those obtained in prior 
studies using aggregate data, we report in columns 3 and 4 the results 
of regressions using the wage measures published by the Bureau of 
Labor Statistics for all workers and for manufacturing production 
workers. The unemployment effect for all workers is, surprisingly, 
positive—though small in magnitude and insignificant—and the ef¬ 
fect for manufacturing workers is negative but very close to zero. The 
difference between these results and our main results could arise 
from a number of differences between the published wage measures 
and our main wage measure. First, the published wage measures are 
annual averages, whereas our measures are taken at a point of time. 

26 However, our data do have some cross-sectional variation, for the interview month
varied across the sample, and we employed the unemployment rate as of that month.
Thus our true standard errors should be less than those in table 7.
As we discussed earlier, use of an annual measure should bias the
unemployment rate coefficient toward zero. This bias appears to be
quantitatively important, as we found in table 5 with the annual
(CPI-deflated) wage measure, WAGE4. Second, Stockman (1983) has
pointed out that the aggregate wage series is hours-weighted,
whereas, of course, our micro data are not. However, this does not
appear to be a significant source of difference, for estimates of
column 1 using hours-weighted wage means from our data show an
unemployment rate coefficient very close to the unweighted one.
Third, the published wage measures are taken from the establishment
survey and hence do not cover the entire universe of the U.S.
work force. On the other hand, our data also apply only to young
men, who may be different from other workers. Gauging the importance
of this difference would require a detailed analysis of the micro
data underlying the bureau's wage measures, which is beyond the
scope of our paper.

The complete lack of cyclicality shown in the aggregate manufac¬ 
turing wage, the wage most frequently employed in the aggregate 
literature, may also be a result of self-selection into and out of manu¬ 
facturing from nonmanufacturing (Heckman and Sedlacek 1985). 
The proper method of accounting for this double selectivity is to 
specify a three-way choice equation between manufacturing, non¬ 
manufacturing, and unemployment. However, such a model would 
be very difficult to compute with panel data, so we simplify it by 
specifying a single selection equation that gives the probability of 
locating in the manufacturing sector rather than in either of the two 
alternatives. The correlation between the error term in this selection 
equation and that in a manufacturing sector wage equation repre¬ 
sents the combined effects of movements of manufacturing workers 
into and out of the work force as well as into and out of nonmanufac¬ 
turing employment. We also estimate a wage equation for the non¬ 
manufacturing sector jointly with a selection equation for the proba¬ 
bility of being employed in nonmanufacturing rather than being in 
the two alternatives. The correlation of the error term in this equation 
with that in the nonmanufacturing wage equation is to be interpreted 
analogously with that in the manufacturing case. 

The results are shown in table 8. The estimated true unemployment
effects in both manufacturing and nonmanufacturing are negative
and significant (-.0094 and -.0063, respectively). However, in
contrast to our economywide results, comparison of the maximum
likelihood random effects results with the OLS estimates indicates
that, while the estimated unemployment effect in nonmanufacturing
is still procyclically biased, the direction of bias in the manufacturing
sector is countercyclical. Thus we have found that during cyclical
TABLE 8

Estimates by Sector

                 Manufacturing               Nonmanufacturing
            -----------------------      -----------------------
                          Random                       Random
              OLS         Effects          OLS         Effects

U           -.0069       -.0095**        -.0089*      -.0065*
            (.0051)      (.0034)         (.0048)      (.0034)
ρ1            ...         .531**           ...         .569**
                         (.013)                       (.007)
ρ2            ...         .691**           ...         .628**
                         (.006)                       (.007)
ρ3            ...        -.304**           ...         .431**
                         (.044)
ρ4            ...        -.138*            ...         .219**
                         (.077)                       (.067)

Note.—Standard errors are in parentheses. Specification (2). Total sample
size is 11,886; employed in manufacturing, 3,641; employed in
nonmanufacturing, 6,869.

* Significant at the 10 percent level.

** Significant at the 5 percent level.


downturns, low-wage workers tend to leave manufacturing, driving
the conditional wage mean upward; but in the economy as a whole, it
is high-wage workers who are most likely to leave employment, and
the wage is biased in a procyclical direction. These results are virtually
identical to those of Heckman and Sedlacek (1985).²⁷

Though the unemployment rate coefficients in the employment 
equations are not shown in table 8, they indicate that, interestingly, 
employment in manufacturing is procyclical but employment in non¬ 
manufacturing is acyclic. This suggests that workers leaving manufac¬ 
turing may be entering nonmanufacturing and that part of the procy¬ 
clical bias in nonmanufacturing may result if these workers enter the 
lower part of the nonmanufacturing wage distribution. This would be 
again identical to the Heckman-Sedlacek results. 

We should also note that our panel data allow us once more to
determine the correlation of transitory wage shocks and employment
(ρ4), and we find once more that, even in manufacturing, those with
high transitory wage draws are more likely to be unemployed.²⁸ This
is consistent with a long-term contract story, such as that of Fischer
(1977), in which those with high transitory wages are likely to be those
whose contracts have locked them into a fixed nominal wage, which,


27 Heckman and Sedlacek specified a structural, three-sector choice equation,
whereas we specify only two reduced-form choice equations.

28 The unemployment rate coefficient in the nonmanufacturing employment
equation is actually positive (though small and insignificant). The positive ρ4 for
nonmanufacturing therefore indicates that those with high transitory wage draws are most
likely to be unemployed (see eq. [7] with y t > 0).
after a negative inflation shock, forces their real wage above their
marginal product.

V. Summary 

In this paper we have used micro data to determine the effect of the 
business cycle on the real wage. We have adjusted for the selection of 
workers of different wages into and out of the work force over the 
cycle by constructing a fixed-weight wage index that includes non¬ 
workers. We find that the selection effect biases the behavior of the 
real wage in a procyclical direction, that, on average, high-wage work¬ 
ers are more likely to become unemployed in a downturn. This effect 
is offset in part by a greater tendency of those with observed charac¬ 
teristics associated with low wages (e.g., low levels of education and 
experience) to leave employment during downturns, thereby biasing 
the aggregate wage in a countercyclical direction. The procyclical 
bias, which arises from the higher rates of nonemployment for those 
with higher unobserved individually heterogeneous wage compo¬ 
nents, is sufficiently large to outweigh the effect arising from ob¬ 
served heterogeneity. Our results also show that the true effect of the 
cycle on real wages, after adjusting for both observed and unobserved 
heterogeneity, remains procyclical in sign, though much smaller in 
magnitude than previous micro data estimates have indicated. We 
also find that our results are robust to different specifications of the 
regressor set, the stochastic specification, the distributional assump¬ 
tion, the measure of the wage and cycle, and the use of an instrument 
for the cyclical indicator. 
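
The direction of the selection effect summarized above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two-state cycle, the logistic rule under which workers with higher unobserved wage components are less likely to be employed in the downturn, and every parameter value are hypothetical choices made for illustration. It is not the maximum likelihood random effects estimator used in the paper, and the numbers are not estimates from the NLS data.

```python
import math
import random


def simulate_selection_bias(n=20000, true_effect=-0.02, seed=0):
    """Compare the true and the employed-only change in mean log wages.

    Each person has log wage 1 + mu in the boom and 1 + mu + true_effect
    in the downturn, where mu is an unobserved individual component.
    Hypothetical selection rule: in the downturn the employment
    probability falls with mu; in the boom it is 0.95 for everyone.
    Returns (true_change, observed_change).
    """
    rng = random.Random(seed)
    boom_all, bust_all, boom_obs, bust_obs = [], [], [], []
    for _ in range(n):
        mu = rng.gauss(0.0, 0.4)
        w_boom = 1.0 + mu
        w_bust = 1.0 + mu + true_effect
        boom_all.append(w_boom)
        bust_all.append(w_bust)
        # Boom: employment does not depend on mu.
        if rng.random() < 0.95:
            boom_obs.append(w_boom)
        # Downturn: high-mu workers are less likely to stay employed.
        p_employed = 1.0 / (1.0 + math.exp(2.0 * mu - 1.5))
        if rng.random() < p_employed:
            bust_obs.append(w_bust)

    def mean(xs):
        return sum(xs) / len(xs)

    true_change = mean(bust_all) - mean(boom_all)      # fixed-weight index
    observed_change = mean(bust_obs) - mean(boom_obs)  # workers only
    return true_change, observed_change


if __name__ == "__main__":
    true_change, observed_change = simulate_selection_bias()
    print(f"true change:     {true_change:+.4f}")
    print(f"observed change: {observed_change:+.4f}")
```

Under this assumed rule the employed-only mean falls by more than the fixed-weight mean in the downturn, so a wage series built from workers alone overstates procyclicality, which is the direction of bias reported in the summary.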

Other subsidiary findings are also of interest. For example, our 
results indicate that those with high permanent wage components are 
less likely to leave employment during cyclical downturns but that 
those with high transitory wage components are more likely to do so; 
the latter effect dominates overall. We also find that the wage in 
manufacturing is countercyclically biased and that manufacturing 
workers with low permanent wages (but, once again, high transitory 
wages) are more likely to become unemployed. The procyclical bias in 
other sectors of the economy dominates the manufacturing effect to 
generate an overall procyclical bias. 

Appendix A 

Likelihood Functions 

The likelihood function for the pooled cross-section, time-series model with
no individual effect is

$$L = \sum_{\text{workers}} \log P_{it} + \sum_{\text{nonworkers}} \log Q_{it}, \tag{A1}$$

where

$$P_{it} = \int_{-Z_{it}\gamma}^{\infty} b(\ln W_{it} - X_{it}\beta,\, v_{it};\, \rho)\, dv_{it}, \tag{A2}$$

$$Q_{it} = \int_{-\infty}^{-Z_{it}\gamma} f(v_{it})\, dv_{it}, \tag{A3}$$

and $b(\epsilon, v; \rho)$ is the bivariate normal density with correlation $\rho$. There are no
identification restrictions on $X_{it}$ and $Z_{it}$.

The likelihood function for the random effects model contains multivariate
normal probabilities of up to $T_i$ dimensions for each individual i (in our data
$T_i$ is as high as 12). Although such integrals are typically computationally
intractable, our restriction on the serial correlation structure of the model
allows us to employ a well-known simplifying factorization (Heckman 1981).
By conditioning on and integrating out the permanent effects $\mu_i$ and $\psi_i$, we
can reduce the order of integration to two. The log likelihood function comes
down to

$$L = \sum_i \log(\mathrm{prob}_i), \tag{A4}$$

where

$$\mathrm{prob}_i = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \Bigl(\prod_{t \in A_i} P_{it} \prod_{t \in B_i} Q_{it}\Bigr) f_1(\mu_i, \psi_i)\, d\mu_i\, d\psi_i \tag{A5}$$

and

$$P_{it} = \int_{-Z_{it}\gamma - \psi_i}^{\infty} f_2(\ln W_{it} - X_{it}\beta - \mu_i,\, \omega_{it};\, \rho_4)\, d\omega_{it}, \tag{A6}$$

$$Q_{it} = \int_{-\infty}^{-Z_{it}\gamma - \psi_i} f(\omega_{it})\, d\omega_{it}. \tag{A7}$$

Here $A_i$ is the set of t for which individual i is a worker, $B_i$ is the set of t for
which he is a nonworker, $f_1$ is the bivariate normal density of $\mu_i$ and $\psi_i$, and $f_2$
is the bivariate normal density of $\eta_{it}$ and $\omega_{it}$. The integral can be numerically
evaluated by extending a relatively efficient quadrature technique developed
by Butler and Moffitt (1982).

The likelihood function for the fixed effects model is obtained by replacing
$\mu_i$ by $d_i$ and $\psi_i$ by $a_i$, to distinguish fixed effects from random effects, and by
conditioning on the probability of sample inclusion:

$$L = \sum_i \sum_{t \in A_i} \log P_{it} + \sum_i \sum_{t \in B_i} \log Q_{it} - \sum_i \log R_i, \tag{A8}$$

where

$$P_{it} = \int_{-Z_{it}\gamma - a_i}^{\infty} b(\ln W_{it} - X_{it}\beta - d_i,\, \omega_{it};\, \rho_4)\, d\omega_{it}, \tag{A9}$$

$$Q_{it} = \int_{-\infty}^{-Z_{it}\gamma - a_i} f(\omega_{it})\, d\omega_{it}, \tag{A10}$$

$$R_i = 1 - \prod_t (1 - Q_{it}) - \prod_t Q_{it}. \tag{A11}$$

We have retained the parameter $\rho_4$ to denote the cross-section correlation of
$\eta_{it}$ and $\omega_{it}$. The term $R_i$ equals the probability that an individual is not a
worker or a nonworker for the entire set of time periods.


Appendix B 


TABLE B1
Sample Exclusions

                                                     Total Number   Total Number
                                                     of Person      of Person-Year
                                                     Observations   Observations

Original NLS sample                                     5,225          62,700
Sample remaining after exclusion if:^a
  Full-time student, self-employed, working
    without pay, or piece-rate worker                   5,126          39,999
Sample remaining after exclusion if:^b
  Schooling or military experience unfinished
    as of interview                                      ...           29,296
  Hours of work per week > 90                            ...           29,178
  Number of children > 9                                 ...           29,177
  Number of household members > 12                       ...           29,127
  Nonlabor income > $20,000 or < -$2,000                 ...           29,118
  WAGE1 < $0.50 or ≥ $12.50 or missing^c                 ...           27,828
  WAGE3 < $0.50 or ≥ $12.50 or missing^c                 ...           27,786
  Age < 21                                               ...           24,617
  Missing completion date for school or military         ...           24,598
  Missing interview month                                ...           24,597
  Missing marital status data                            ...           24,583
  Missing education data                                 ...           24,184
  Missing headship data                                  ...           24,184
  Missing industry code                                  ...           24,175
  Missing occupation code                                ...           24,151
  Missing WAGE5^c                                        ...           24,127
  Missing WAGE2^c                                        ...           23,931
  Missing WAGE4^c                                       4,439          23,927
Final sample                                            4,439          23,927
Half-sample^d                                           2,219          11,886
Fixed effects sample                                      723           4,581

^a Individual exclusions are not counted, though most apparently are due to
full-time student exclusion.

^b Sequential ordering. Screening criteria resulting in zero exclusions are not
shown. Since most exclusions are ..., counts of person-specific exclusions are
not obtained.

^c "Missing" wage means missing if employed. See ... for wage definitions.

^d "Half-sample" is not exactly one-half because it was constructed by taking
half of the people (cross-sectional ...) rather than by taking half the
person-year observations.
Appendix C

Information Matrix Test²⁹


The information matrix test determines whether

$$E\left[\frac{\partial^2 \ln L_i(\theta)}{\partial\theta\,\partial\theta'} + \frac{\partial \ln L_i(\theta)}{\partial\theta}\,\frac{\partial \ln L_i(\theta)}{\partial\theta'}\right] = 0,$$

that is, whether the expected values of the elements of the sum of the Hessian
of the likelihood function and the outer product of the gradients of that
function are equal to zero. If there are A parameters in the model, then there
are q = A(A + 1)/2 unique elements per observation in the sample version of
the test. Since this test is just a set of conditional moment restrictions, we can
form a conditional moment test (Newey 1985) if we can supplement these
restrictions with an appropriate set of orthogonality conditions. In this case,
the orthogonality conditions are that the expected values of the elements of
the gradient of the conditional likelihood function, evaluated at the true
parameters, are zero. Thus $nR^2$ from the regression of a column of ones on
the sample values of all these orthogonality conditions and conditional
moment restrictions for each observation is distributed $\chi^2_q$ under the null
hypothesis of correct model specification.³⁰

Unfortunately, the conditional moment restrictions in this test are often
collinear when the number of parameters is large. When this occurs, we
cannot use a regression to evaluate the conditional moment test. However,
the above-mentioned $nR^2$ is the same as computing $n\,g(\hat\theta)'Wg(\hat\theta)$, where $g(\hat\theta)$
is the sample average vector of the orthogonality conditions and the
conditional moment restrictions and W is the inverse of the covariance matrix of
that vector, which we denote S. This statistic is distributed as $\chi^2_q$ under the null
hypothesis of correct specification if S is of full rank. If the rows of $g(\hat\theta)$ are
collinear, then S will be singular, and thus the statistic cannot be computed.
However, an extension of the analysis of Hausman and Taylor (1982) implies
that we can replace W with a generalized inverse of S. In this case, the test
statistic has r = [rank(S) - A] degrees of freedom. In fact, Newey (1983)
shows that any generalized inverse can be used.
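
The counting in this appendix can be made concrete with a few lines of code. This is a minimal sketch: the function names and the example values of A and rank(S) are hypothetical, and only the two formulas, q = A(A + 1)/2 and r = rank(S) - A, come from the text.

```python
def im_test_restrictions(num_params: int) -> int:
    """Number of unique elements of a symmetric A x A matrix: q = A(A + 1)/2."""
    return num_params * (num_params + 1) // 2


def stacked_vector_length(num_params: int) -> int:
    """Length of g: A orthogonality conditions plus q moment restrictions."""
    return num_params + im_test_restrictions(num_params)


def degrees_of_freedom(rank_s: int, num_params: int) -> int:
    """r = rank(S) - A, the degrees of freedom with a generalized inverse of S."""
    return rank_s - num_params


if __name__ == "__main__":
    A = 10                               # hypothetical number of parameters
    q = im_test_restrictions(A)          # 55 unique Hessian elements
    full_rank = stacked_vector_length(A) # S is 65 x 65 in this example
    print(q, degrees_of_freedom(full_rank, A))
```

When S has full rank, r reduces to q, so the generalized-inverse version of the test agrees with the regression version in that case.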
The Structure of Simple General Equilibrium
Models with Frictional Unemployment


Carl Davidson, Lawrence Martin, 
and Steven Matusz 

Michigan State University 


We develop a two-sector general equilibrium model in which equilib¬ 
rium unemployment arises endogenously because of trading fric¬ 
tions in the labor market of one sector. Externalities inherent in the 
search process lead to inefficient equilibria, and this has important 
implications for the basic structure of the economy. In particular, 
the relationship between factor rewards and commodity prices is 
fundamentally different from the analogous relationship in a fric¬ 
tionless economy. One implication is that the economy's relative sup¬ 
ply curve may be downward sloping, especially when the search 
sector is small. We also present several applications of the analysis. 


I. Introduction 

Because of its simple structure and intuitive appeal, the two-sector 
general equilibrium Walrasian model has served “as the workhorse 
for most of the developments in the pure theory of international 
trade” (Jones 1965, p. 557). This model has also been used exten¬ 
sively in most of the other applied fields of economics. In his seminal 
article “The Structure of Simple General Equilibrium Models,” Jones 
contributed to our understanding of this model by exposing its basic 
structure in a manner that allowed him to unify the many approaches 
that had been developed in different applied areas. In particular, he 
examined two key relationships that govern the behavior of the supply
side of a competitive economy. The first relationship links factor
endowments, factor rewards, and output levels and is derived from
the requirement that factor markets clear. The second relationship
is between factor rewards and commodity prices and results from
the fact that unit production costs must equal output prices in the
presence of constant returns to scale. By carefully examining these
relationships, it is possible to derive several of the fundamental
theorems of international economics (e.g., the Rybczynski and
Stolper-Samuelson theorems). In addition, the effects of parametric changes
in the economy (e.g., tax rates) can be easily understood by examining
how such changes work through these two basic relationships.

While the Walrasian model is exceedingly useful for many types of 
analysis, one may be reluctant to rely on it to analyze situations in 
which unemployment is a consideration: factors of production are 
always fully employed in the full-information, frictionless markets. 
Since casual observation suggests that the effects of a policy on the 
unemployment rate and the welfare of the unemployed are often 
major considerations for policymakers, it is vitally important to aug¬ 
ment the Walrasian approach in a way that permits the investigation 
of these issues. The purpose of this paper is to do just that. In particu¬ 
lar, we present and analyze a simple two-sector general equilibrium 
model that allows for equilibrium unemployment. Our main goal is to 
understand how our model differs in structure from the model ana¬ 
lyzed by Jones (which we refer to as the Jones model). 1 Our analysis 
indicates that introducing frictional unemployment creates the poten¬ 
tial for the structural relationships within the economy to be quali¬ 
tatively different from the analogous relationships in a frictionless 
economy. 

We model unemployment by assuming that in one sector, factor 
markets are frictionless so that the duration of unemployment is zero, 
while in the other sector, idle factors of production must search each 
other out in order to produce. 2 Factors are mobile across sectors and 

1 More accurately, we compare our model with the standard two-sector, two-factor
model using an approach that closely resembles the approach adopted by Jones in his
masterful exposition and synthesis of earlier work. Our reference to the "Jones model"
is not intended to imply that Jones was the first to formalize, develop, or use the model.

2 This approach was originally suggested by Friedman (1968) in his classic paper
“The Role of Monetary Policy,” in which he defined the natural rate of unemployment 
as “the level that would be ground out by the Walrasian system of general equilibrium 
equations, provided there is imbedded in them the actual characteristics of the labor 
and commodity markets, including market imperfections, stochastic variability in de¬ 
mands and supplies, the cost of gathering information about job vacancies and labor 
availabilities, the costs of mobility, and so on." This definition has recently been made 
operational by Diamond in a series of papers (1981, 1982, 1984a, 1984b) in which he
models unemployment as the result of problems in coordinating exchange. He accom- 
in equilibrium distribute themselves so that the expected lifetime return
is the same in both sectors. The trading frictions in the search
sector represent the only difference between our model and the Jones 
model, and yet we demonstrate that in some cases this leads to an 
economy with a surprisingly different structure. In particular, we 
show that the relationship between factor rewards and commod¬ 
ity prices is fundamentally different in an economy with search¬ 
generated unemployment. This leads to the possibility of downward- 
sloping relative supply curves, and, in fact, we demonstrate that such 
a phenomenon is likely to occur if the search sector is relatively small. 
If, however, the search sector is large enough, the structure of the 
economy will be qualitatively the same as the structure of the fric¬ 
tionless general equilibrium model. 

In the next section, we introduce the model, define equilibrium, 
and briefly discuss its efficiency properties. Equilibrium is generally 
not efficient, and this is one of the driving forces behind our results. 
In Section III, we derive the two fundamental relationships that de¬ 
scribe the structure of the model and compare them with their 
counterparts in the Jones model. In order to gain some insight into 
the forces behind our results, we devote Section IV to a detailed 
investigation of the nature of the production process in the search 
sector. In this section, we demonstrate that the equilibrium factor 
intensity employed in the search sector is not optimal. Changes in 
factor prices alter this factor intensity and may enhance or inhibit 
efficiency. When the search sector is small these efficiency effects are 
large enough to dominate the traditional mechanism linking com¬ 
modity and factor prices. As the sector grows in size, this effect 
shrinks relatively and the economy eventually behaves in a manner 
similar to a pure Walrasian economy. Some applications of the analy¬ 
sis are discussed in Section V, and concluding remarks are offered in 
Section VI. 

II. The Model 

A. Endowments, Production Technology, 
and Preferences 

Our economy consists of two sectors (X and Y) and two factors of 
production (A and B). Each factor is endowed with one (indivisible) 

plishes this by assuming that it takes time and effort for potential trading partners to
find each other. Our approach is therefore very similar to Diamond's. However, his
purpose in developing these models was to investigate the macroeconomic properties
of economies with a natural rate of frictional unemployment, and he was able to do so
in a remarkably simple setting: a barter economy with one good and one factor of
production. The questions that we wish to address are fundamentally different and
require a somewhat more elaborate model.
unit of leisure in each period that she may either consume or offer as
labor. Each worker is finitely lived, although the age of a worker at 
death is a random variable. The probability that any individual 
worker of either type dies in any given period is time invariant and 
equal to d, which also equals the birth rate for both types of workers. 
All workers are risk neutral. 

We assume that the production function corresponding to good Y
is twice continuously differentiable and characterized by constant
returns to scale:

$$Y = Y(L_{ay}, L_{by}), \tag{1}$$

where $L_{iy}$ is the amount of type i labor employed in the production
of Y.

Finally, we assume that the production of one unit of X requires
exactly one worker of each type. We refer to a pairing of opposites as
a "match" and designate the number of matches as $L_m$, so that $X = L_m$.


B. The Search Process 

Workers in sector Y are immediately hired in full-information, com¬ 
petitive auction markets, while workers who seek employment in sec¬ 
tor X must search each other out. if a type i worker in sector X fails to 
locate a worker of the opposite type, she remains idle (unemployed) 
for that period. A worker in this sector engages in search at the start 
of every period in which she is unemployed. Matches survive as long 
as both workers live. On the death of either partner, the survivor 
again engages in search. 3 

We assume that the number of new matches created every period
($L_n$) is a function of the numbers of workers of each type who begin
the period searching. This function is characterized by constant
returns to scale, positive marginal products, and symmetry, so that
interchanging its two arguments leaves $L_n$ unchanged. 4 In this case,
the per period probability that a type i searcher finds a match can be
written as $e_i(s)$, where $s$ denotes the proportion of searchers who are
of type A. Our assumptions about the search technology imply that
$e_a'(s) < 0 < e_b'(s)$ and that $e_i(s) = e_j(1 - s)$ for $j \neq i$. 5

3 Consistent with the bulk of the literature in this area, deaths are used to capture the
role of exogenous separations in factor markets (see, e.g., Diamond 1981, 1982;
Mortensen 1982; Pissarides 1984).

4 Empirical support for the assumption of constant returns to scale is provided by
Nickell (1979). The implications of increasing returns to scale in the search technology
are discussed at length in Diamond (1984b). The substance of our results does not
depend on the symmetry assumption. In fact, in n. 16 we show how our results can be
generalized to situations in which the matching function is not symmetric.
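The properties listed above can be checked numerically. The sketch below is only an illustration: the matching function $L_n(a, b) = ab/\sqrt{a^2 + b^2}$ is an assumption chosen for convenience (it is symmetric, has constant returns to scale and positive marginal products, and never exceeds the smaller searcher count); it is not taken from the paper, and the function names are hypothetical.

```python
import math


def new_matches(a: float, b: float) -> float:
    """Assumed matching function (illustrative only, not from the paper)."""
    return a * b / math.hypot(a, b)


def e_a(s: float) -> float:
    """Match probability of a type A searcher when a share s of searchers is type A."""
    return new_matches(s, 1.0 - s) / s


def e_b(s: float) -> float:
    """Match probability of a type B searcher when a share s of searchers is type A."""
    return new_matches(s, 1.0 - s) / (1.0 - s)


if __name__ == "__main__":
    # e_a falls and e_b rises as the share of type A searchers grows.
    for share in (0.3, 0.5, 0.7):
        print(f"s={share:.1f}  e_a={e_a(share):.4f}  e_b={e_b(share):.4f}")
```

Under this assumed function the two probabilities mirror each other around s = 1/2, and their ratio equals (1 - s)/s, as constant returns to scale requires.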


C. Factor Returns 

Since sector Y markets are frictionless and competitive, all workers in
this sector are always employed and earn the value of their marginal
product. If we let $w_{iy}$ denote the wage paid to a type i worker in this
sector and $P_y$ the price of good Y, then this condition is expressed as

$$P_y Y_i = w_{iy} \quad \text{for } i = a, b, \tag{2}$$

where $Y_i$ is the partial derivative of Y( ) with respect to $L_{iy}$. Letting
$V_{iy}$ denote the expected lifetime income to a type i worker in sector Y,
we have

$$V_{iy} = \frac{w_{iy}}{d} \quad \text{for } i = a, b. \tag{3}$$

We assume that the proceeds from the sale of a unit of X are
divided between the two matched workers who produced it according
to the Nash cooperative bargaining solution so that the surplus
created by the match is evenly split. 6 For future use, we let $\alpha_i$ denote
the share of the proceeds that go to the type i worker and $P_x$ the price
of X.

To describe the solution to the bargaining problem formally, let $V_{is}$
denote the expected lifetime income to a type i worker currently
searching in sector X and $V_{im}$ the expected lifetime income to a type i


5 We first note that $e_i$, the probability that a randomly chosen type i searcher becomes
matched, is equal to $L_n$ divided by the number of type i searchers, where $L_n$ cannot
exceed the number of searchers of either type. Dividing the numerator and denominator
by the total number of searchers permits us to express $e_i$ solely as a function of s, the
proportion of searchers who are type A agents. Next, symmetry follows because

$$e_a(s) = \frac{L_n(s, 1 - s)}{s} = \frac{L_n(1 - s, s)}{s} = e_b(1 - s).$$

Moreover,

$$e_a'(s) = \frac{1}{s}\left[L_{n1} - L_{n2} - \frac{L_n}{s}\right] = -\frac{L_{n2}}{s^2} < 0,$$

where the numeric subscript refers to partial differentiation and the last equality
follows from the assumption of constant returns to scale. Finally, we note that constant
returns to scale in the matching technology implies $e_a(s)/e_b(s) = (1 - s)/s$ for all s.

6 This assumption is consistent with a bargaining process in which the agents
exchange sharing rules until one agent makes an offer that is acceptable to her partner,
and the time it takes to make counteroffers is arbitrarily close to zero. See the work of
Binmore (1982), McLennan (1982), Rubinstein (1982), and Rubinstein and Wolinsky
worker currently matched in sector X. Then, under the assumption
that unemployed workers earn no income, 7

$$V_{is} = e_i(s) V_{im} + [1 - e_i(s)](1 - d) V_{is}, \tag{4}$$

$$V_{im} = \alpha_i P_x + (1 - d)^2 V_{im} + d(1 - d) V_{is} \quad \text{for } i = a, b. \tag{5}$$

In (4), $[1 - e_i(s)](1 - d)$ represents the probability of failing to find a
match but surviving to search again next period. In (5), $(1 - d)^2$
represents the probability that both partners survive the period so
that they begin the next period still matched and $d(1 - d)$ represents
the probability that the type j worker will die and the type i worker
will survive (in which case the type i worker must begin searching
again). Equations (4) and (5) can be solved for $V_{is}$ and $V_{im}$ to obtain

$$V_{is} = \frac{e_i \alpha_i P_x}{d[1 - r(1 - e_i)]}, \tag{6}$$

$$V_{im} = \frac{[1 - (1 - d)(1 - e_i)] \alpha_i P_x}{d[1 - r(1 - e_i)]} \quad \text{for } i = a, b, \tag{7}$$

where the argument of $e_i$ has been suppressed and we have defined
$r \equiv (1 - d)^2$.

We are now in a position to describe how $\alpha_i$ is determined. As
previously indicated, the Nash bargaining solution evenly divides the
surplus created by the match. The surplus is the excess of expected
lifetime income if matched over the next-best alternative, namely,
waiting a period and searching again. The surplus generated for a
type i worker is

$$V_{im} - (1 - d)V_{is} = \frac{\alpha_i P_x}{1 - r(1 - e_i)} \quad \text{for } i = a, b. \tag{8}$$

Equating $V_{am} - (1 - d)V_{as}$ with $V_{bm} - (1 - d)V_{bs}$ and solving for $\alpha_a$, we
obtain

$$\alpha_a = \frac{1 - r(1 - e_a)}{2 - r(2 - e_a - e_b)}. \tag{9}$$

When output prices and the mix of searchers are given, this value of
$\alpha_a$ solves the Nash cooperative bargaining problem. Of course,
$\alpha_b = 1 - \alpha_a$.
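The bargaining solution can be verified with a few lines of arithmetic. The sketch below evaluates the closed forms in (6), (7), and (9) and confirms that both partners receive the same surplus; the function names and the numerical values in the usage lines are hypothetical and chosen only for illustration.

```python
def nash_share_a(e_a: float, e_b: float, d: float) -> float:
    """Share of P_x going to the type A worker, equation (9)."""
    r = (1.0 - d) ** 2
    return (1.0 - r * (1.0 - e_a)) / (2.0 - r * (2.0 - e_a - e_b))


def lifetime_incomes(e: float, alpha: float, p_x: float, d: float):
    """Return (V_is, V_im) from equations (6) and (7)."""
    r = (1.0 - d) ** 2
    denom = d * (1.0 - r * (1.0 - e))
    v_search = e * alpha * p_x / denom
    v_matched = (1.0 - (1.0 - d) * (1.0 - e)) * alpha * p_x / denom
    return v_search, v_matched


def match_surplus(e: float, alpha: float, p_x: float, d: float) -> float:
    """Surplus V_im - (1 - d) V_is, the left-hand side of equation (8)."""
    v_search, v_matched = lifetime_incomes(e, alpha, p_x, d)
    return v_matched - (1.0 - d) * v_search


if __name__ == "__main__":
    # Illustrative parameter values (assumptions, not from the paper).
    share_a = nash_share_a(0.4, 0.8, 0.1)
    print(match_surplus(0.4, share_a, 1.0, 0.1))
    print(match_surplus(0.8, 1.0 - share_a, 1.0, 0.1))
```

The worker with the lower matching probability receives the smaller share, which is what equalizes the two surpluses.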


D. Equilibrium 

In any steady-state equilibrium, the number of type i workers who
enter sector k through birth must equal the number who exit the
sector because of death. In addition, the number of searchers of each
type and the number of matches must be time invariant. If we let $\beta_i$
denote the proportion of newly born type i workers who choose to
seek employment in sector X, $L_i$ the number of type i workers, and $L_{is}$
the number of type i searchers remaining after the matching process
for the period has ended, then these conditions imply

$$(L_m + L_{is}) d = \beta_i L_i d, \tag{10}$$

$$L_m = r L_m + d(1 - d) e_i L_m + \beta_i L_i e_i d + L_{is}(1 - d) e_i. \tag{11}$$

In (10), $\beta_i L_i d$ is the number of newly born type i workers who enter
sector X and $(L_m + L_{is})d$ is the number of type i sector X workers who die
each period. If this condition holds, the sector sizes remain constant
over time. In (11), $r L_m$ is the number of matches that survive from the
previous period; $\beta_i L_i e_i d$ is the number of new entrants who
immediately find partners; $d(1 - d) e_i L_m$ is the number of type i workers
who survived, had partners who died, and immediately found new
matches; and $L_{is}(1 - d) e_i$ is the number of type i searchers who survived
and found partners. These last three terms represent all workers who
begin period t as searchers but end the period matched. 8 The sum of
all four terms on the right-hand side yields the number of matches at
time t, which must equal the number at time t - 1, given on the
left-hand side of (11). If this condition holds, the number of matches
and the number and mixture of searchers will be time invariant.

7 This assumption is made to keep the algebra as simple as possible; it is not essential
for any of our results.

In addition to these steady-state conditions, a nonspecialized
production equilibrium for this economy is characterized by (i) zero
profits in sector Y, (ii) indifference between sectors for both types
of workers, and (iii) zero excess demand for labor of each type in
sector Y. 9

The zero-profit condition for sector Y is given by

$$P_y = l_{ay} w_{ay} + l_{by} w_{by}, \tag{12}$$

where $l_{ik}$ denotes the unit input requirement of type i workers in
sector k.

For a type i worker to be indifferent between sectors, the expected 
lifetime return to seeking employment in either sector must be the 
same. More exactly, since a worker cannot choose employment in sector 

8 In computing s, it is crucial to distinguish the number of workers who begin the
period as searchers from $L_{is}$, the number who remain searching after the completion of
the matching process.

9 We should point out that, regardless of relative output prices, specialization in good
Y is always a production equilibrium. This follows from the fact that if everyone in the
economy except one worker seeks employment in sector Y, it will be impossible for that
worker to produce in sector X. Thus that worker will take a job in sector Y regardless of
the relative product prices.




X (she can choose only to search), the second equilibrium condition is
given by

$$V_{iy} = V_{is} \quad \text{for } i = a, b. \tag{13}$$

The final equilibrium condition is met when labor supply equals
demand:

$$L_i = L_{iy} + L_{is} + L_m \quad \text{for } i = a, b. \tag{14}$$

This completes the description of the model. We emphasize that
the only difference between this and the Jones model is that search is
required to find employment in sector X. If we set $e_i(s) = 1$ for both i
and for all s, the model would reduce to the standard model with
Leontief technology in sector X.

E. Efficiency 

Diamond (1982) and others have shown that when search is required
to find trading opportunities, externalities are generated that lead to
inefficiencies. In the context of the model presented here, it can be
demonstrated that in equilibrium the search sector is too small and its
factor intensity is too asymmetric (see Davidson, Martin, and Matusz
1987). The reason for this is that whenever a match occurs, the two
partners each enjoy an increase in expected lifetime income measured
by $V_{im} - V_{is}$ for i = a, b. However, when a worker contemplates
entering sector X, she ignores the increase in her partner’s expected
income that will be generated by every partnership that she enters.
Workers therefore ignore a positive externality associated with
entering this sector so that it is too small in equilibrium. Moreover, the
externalities are not of equal magnitude. If the majority of searchers
are type i workers, then, in general, the type j workers ignore a larger
external effect than their type i counterparts. Sector X then becomes
too i-intensive. Making sector X more j-intensive moves the economy
toward the production possibilities frontier as the economy utilizes
the search technology more efficiently. As we demonstrate in Section
IV, it is this externality that, when sector X is relatively small, leads the
economy to behave in a fundamentally different manner than a
frictionless economy.

III. Hat Calculus

In this section, we derive the equations of change that relate the prices
of factors and goods to output quantities. 10 For convenience, we
introduce the following additional notation: $L_{ix} = L_{is} + L_m$ = the total
number of type i workers in sector X; $l_k = L_{ak}/L_{bk}$ = the A-intensity of
sector k; $\theta_{ik}$ = the value share of factor i in industry k (e.g.,
$\theta_{ay} = w_{ay} l_{ay}/P_y$); $\theta = \theta_{ax} - \theta_{ay}$; $\lambda_{ik} = L_{ik}/L_i$ = the
physical share of factor i in industry k; $\lambda = \lambda_{ax} - \lambda_{bx} = \lambda_{by} - \lambda_{ay}$;
$\sigma_k$ = the elasticity of substitution in sector k; $\phi_i = e_i' s/e_i$ = the
elasticity of $e_i$ with respect to s; and $\phi = \phi_a - \phi_b < 0$.

10 Our comparative static exercises are actually comparisons across steady states.
When evaluating the effect of a change in a parameter, we ignore the period of
convergence to the new steady state. To take into account this period of convergence, the
analysis would have to be modified along the lines of Diamond (1980).

We begin with a brief review of the key equations of the Jones
model and an explanation of how these relationships are altered by
the introduction of trading frictions. There are six key equations in
the Jones model: two factor-market-clearing conditions, two
zero-profit conditions, and two cost minimization conditions. 11 In the
present context, sector Y is modeled in the standard way, consisting of
perfectly competitive profit-maximizing firms with constant returns
to scale technology. In addition, factor markets in sector Y are
frictionless. Therefore, the zero-profit and cost minimization conditions
for this sector are unchanged (eqq. [12] and [2], respectively). The
factor-market-clearing conditions (eqq. [14]) are also unchanged
provided that factor usage in sector X is taken to include both matched
and unmatched factors. Any differences in the models must
therefore be the result of differences in the sector X zero-profit and cost
minimization conditions.

The analogue of the sector X zero-profit condition can be derived
from (13), the equation that states that workers must be indifferent
between sectors in equilibrium. We obtain (see result A1 of the
Appendix)

$$P_x = l_{ax} w_{ay} + l_{bx} w_{by}. \tag{15}$$

By examining (13) in some detail, we can provide an intuitive
interpretation of (15). 12 In the Appendix we demonstrate that (13) may be
written as $\alpha_i P_x = l_{ix} w_{iy}$, where $l_{ix} = (L_{is} + L_m)/L_m$. Define
$\pi_{ix} \equiv 1/l_{ix}$ so that $\pi_{ix}$ represents the probability that a sector X,
type i factor will be matched at the end of the period (note that
$\pi_{ix} \neq e_i(s)$ since the latter represents the probability that a factor,
initially unmatched, will find a match during the period). Next, define
$w_{ix} \equiv \alpha_i P_x$ so that $w_{ix}$ is the wage earned in sector X by a type i
factor that is matched. Then (13) states that, in equilibrium, arbitrage
by factors across sectors requires the certain return in sector Y to equal
the (unconditional) expected return in sector X:
$w_{iy} = \pi_{ix} w_{ix} + (1 - \pi_{ix}) \cdot 0$. Equation (15) is obtained by solving
for $w_{ix}$ and summing $w_{ax}$ and $w_{bx}$.
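The arbitrage reading of (13) can be illustrated numerically. The sketch below assumes the closed forms the paper reports for the bargaining share in (9), the unit input requirement in (16), and the sector Y wage in (32); the function names and parameter values are hypothetical.

```python
def unit_requirement(e_i: float, d: float) -> float:
    """l_ix: type i workers (matched plus searching) per unit of X, as in (16)."""
    r = (1.0 - d) ** 2
    return (1.0 - r * (1.0 - e_i)) / e_i


def bargaining_share(e_i: float, e_j: float, d: float) -> float:
    """alpha_i: share of P_x received by the type i partner, as in (9)."""
    r = (1.0 - d) ** 2
    return (1.0 - r * (1.0 - e_i)) / (2.0 - r * (2.0 - e_i - e_j))


def sector_y_wage(e_i: float, e_j: float, p_x: float, d: float) -> float:
    """w_iy that leaves a type i worker indifferent between sectors, as in (32)."""
    r = (1.0 - d) ** 2
    return e_i * p_x / (2.0 - r * (2.0 - e_i - e_j))


def expected_sector_x_wage(e_i: float, e_j: float, p_x: float, d: float) -> float:
    """pi_ix * w_ix: probability of being matched times the matched wage."""
    pi_ix = 1.0 / unit_requirement(e_i, d)
    w_ix = bargaining_share(e_i, e_j, d) * p_x
    return pi_ix * w_ix


if __name__ == "__main__":
    # Illustrative values (assumptions): e_a = 0.4, e_b = 0.8, P_x = 2, d = 0.1.
    print(expected_sector_x_wage(0.4, 0.8, 2.0, 0.1), sector_y_wage(0.4, 0.8, 2.0, 0.1))
```

In this sketch the expected sector X wage coincides with the certain sector Y wage for each type, and the two cost terms $l_{ix} w_{iy}$ sum to $P_x$, which is the content of (15).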


11 Jones actually begins his analysis at a more fundamental level, explicitly pointing
out the dependence of the input requirements on factor prices. The six equations that
we describe can be derived from his slightly larger system.

12 We would like to thank an anonymous referee for this interpretation.
Finally, we turn to the analogue of the cost minimization condition
for sector X. In the standard model, firms choose their mix of inputs
to minimize unit production cost. In our model, however, the factor
intensity in the search sector is governed not by a cost minimization
condition, but by the steady-state conditions (10) and (11). From these
equations we obtain

$$l_{ix} = \frac{1 - r(1 - e_i)}{e_i}. \tag{16}$$

In Section II we noted that the equilibrium factor intensities in our
model are not optimal. Therefore, the value of $l_{ix}$ in (16) is not the
value that minimizes the cost of producing a given amount of X. As
we will see below, this is the driving force behind our results.
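The step from the steady-state conditions to the unit input requirement can be checked with a small stock-flow calculation. The sketch below is a hedged illustration: it solves (10) and (11) for the stock of searchers left after matching, then confirms that the match count reproduces itself and that workers per match equal the expression in (16). Function names and the numbers used are hypothetical.

```python
def searchers_after_matching(l_m: float, e: float, d: float) -> float:
    """L_is implied by (10) and (11) for a given number of matches L_m."""
    r = (1.0 - d) ** 2
    return l_m * (1.0 - r) * (1.0 - e) / e


def matches_next_period(l_m: float, l_s: float, e: float, d: float) -> float:
    """Right-hand side of (11), with entrants given by (10)."""
    r = (1.0 - d) ** 2
    entrants = (l_m + l_s) * d  # births into the sector replace deaths, eq. (10)
    return (
        r * l_m                        # matches in which both partners survive
        + d * (1.0 - d) * e * l_m      # survivors of broken matches who rematch
        + entrants * e                 # new entrants who match immediately
        + l_s * (1.0 - d) * e          # surviving searchers who match
    )


def unit_requirement(e: float, d: float) -> float:
    """l_ix from equation (16)."""
    r = (1.0 - d) ** 2
    return (1.0 - r * (1.0 - e)) / e


if __name__ == "__main__":
    # Illustrative values (assumptions): 100 matches, e = 0.6, d = 0.1.
    stock = searchers_after_matching(100.0, 0.6, 0.1)
    print(stock, matches_next_period(100.0, stock, 0.6, 0.1))
```

With these illustrative values roughly 12.7 searchers accompany every 100 matches, so about 1.127 workers of the type are tied up per unit of X.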

We are now in a position to derive the equations of change. We first
develop the relationship between factor rewards and output
quantities by rewriting (14), the factor-market-clearing condition, as

$$L_i = l_{ix} X + l_{iy} Y. \tag{17}$$

Logarithmic differentiation of this condition yields

$$\hat{L}_i = \lambda_{ix}\hat{X} + \lambda_{iy}\hat{Y} + \lambda_{ix}\hat{l}_{ix} + \lambda_{iy}\hat{l}_{iy}, \tag{18}$$

where the circumflex denotes the proportionate change in a variable.

We use the equilibrium conditions to express $\hat{l}_{ix}$ as a function of the
changes in sector Y wages. We begin by differentiating (16) to obtain

$$\hat{l}_{ix} = -\frac{\phi_i (1 - r)\hat{s}}{1 - r(1 - e_i)}. \tag{19}$$

Next, we can use equations (3), (6), (9), and (13) to solve for the
equilibrium wages as a function of s. Doing so, we obtain
$w_{ay}/w_{by} = e_a/e_b$. 13 Differentiation yields

$$\hat{s} = \frac{\hat{w}_{ay} - \hat{w}_{by}}{\phi}. \tag{20}$$

Substitution of (20) into (19) yields $\hat{l}_{ix}$ as a function of
$\hat{w}_{ay} - \hat{w}_{by}$.


13 The relationship states that relative sector Y wages equal relative ex ante
employment probabilities for those initially unemployed. The latter ratio depends on the
relative proportions of factors among the unemployed. This dependence makes sense
in that unemployment in sector X is the relevant alternative for employed sector Y
workers. On the other hand, relative sector X wages (for matched workers) equal the
ratio of unconditional probabilities of employment. In this case
the latter ratio is the sectoral factor intensity, including the employed and the
unemployed. The reason for this is that sector X wages are the result of a bargaining process.
In this process, the factor that is more abundant in the sector negotiates from a weaker
position because if the match were to dissolve, she would have a more difficult time
finding a new partner.
Similarly, cost minimization in sector Y implies

$$\hat{l}_{iy} = -\theta_{jy}\sigma_y(\hat{w}_{iy} - \hat{w}_{jy}). \tag{21}$$

Finally, substituting (19)-(21) into (18) and setting $\hat{L}_i = 0$ yields

$$\lambda_{ax}\hat{X} + \lambda_{ay}\hat{Y} = (\hat{w}_{ay} - \hat{w}_{by})(q_a + \lambda_{ay}\theta_{by}\sigma_y), \tag{22}$$

$$\lambda_{bx}\hat{X} + \lambda_{by}\hat{Y} = (\hat{w}_{ay} - \hat{w}_{by})(q_b - \lambda_{by}\theta_{ay}\sigma_y), \tag{23}$$

where $q_i = \lambda_{ix}\phi_i(1 - r)/\phi[1 - r(1 - e_i)]$.

The relationship between factor rewards and output quantities is
now found by subtracting (23) from (22). The resulting expression is

$$\lambda(\hat{X} - \hat{Y}) = (\hat{w}_{ay} - \hat{w}_{by})(q_a - q_b + \tau_y\sigma_y), \tag{24}$$

where $\tau_k = \theta_{bk}\lambda_{ak} + \theta_{ak}\lambda_{bk} > 0$.

Compare (24) with the result implied by equations (1b) and (2b) of
Jones (1965), reported here (with obvious changes of notation) as

$$\lambda(\hat{X} - \hat{Y}) = (\hat{w}_a - \hat{w}_b)(\tau_x\sigma_x + \tau_y\sigma_y). \tag{25}$$

Because there are no factor market frictions in the Jones analysis, $w_i$
denotes the payment to factor i regardless of sector.

By inspection, the only difference between (24) and (25) is that
$q_a - q_b$ replaces $\tau_x\sigma_x$. However, $q_a > 0$ and $q_b < 0$. Therefore,
$q_a - q_b > 0$, and it follows that the mechanism linking output levels to
factor rewards in our model is qualitatively identical to the mechanism
at work in the standard two-sector general equilibrium model. In both
cases, the qualitative effect of a change in relative outputs on relative
factor returns depends on the relative physical factor intensities of the
sectors (i.e., the sign of $\lambda$).

We now turn to the derivation of the equation that relates
commodity and factor prices in our model. Totally differentiating (15), the
sector X pricing equation, and using (19) and (20), we obtain

$$\hat{P}_x = \theta_{ax}\hat{w}_{ay} + \theta_{bx}\hat{w}_{by} - (\hat{w}_{ay} - \hat{w}_{by})\Sigma, \tag{26}$$

where $\theta_{ix} = \alpha_i$ and $\Sigma = (\phi_a + \phi_b)(1 - r)/\phi[2 - r(2 - e_a - e_b)]$.
Similarly, we differentiate (12) to obtain an expression for $\hat{P}_y$:

$$\hat{P}_y = \theta_{ay}\hat{w}_{ay} + \theta_{by}\hat{w}_{by}. \tag{27}$$

Finally, subtracting (27) from (26) provides the link between factor
prices and product prices when factor markets exhibit frictional
unemployment:

$$\hat{P}_x - \hat{P}_y = (\theta - \Sigma)(\hat{w}_{ay} - \hat{w}_{by}). \tag{28}$$

Now, from (3b) and (4b) of Jones (1965), we can derive

$$\hat{P}_x - \hat{P}_y = \theta(\hat{w}_a - \hat{w}_b). \tag{29}$$
Comparing (28) with (29), we see that the link between commodity
and factor prices in our model is fundamentally different from that in
the standard model. In the latter, the qualitative effect of a change in
commodity prices on factor returns depends on the relative value
factor intensities (i.e., the sign of $\theta$). As we show below, the $\Sigma$ term in
(28) complicates this link in a manner that has significant implications
for the shape of the relative supply curve (i.e., the supply-side
relationship between X/Y and $P_x/P_y$).

In a frictionless, nondistorted world, increases in the relative price
of X always cause the supply of X to rise and the supply of Y to fall.
This is seen by combining (25) and (29) to yield

$$\hat{X} - \hat{Y} = (\hat{P}_x - \hat{P}_y)\frac{\tau_x\sigma_x + \tau_y\sigma_y}{\lambda\theta}. \tag{30}$$

In an economy with no distortions or frictions, the two measures of
factor intensity (in terms of value and physical quantities) have the
same sign so that $\lambda\theta > 0$. Thus the relative supply curve is always
upward sloping.

Combining (24) and (28), we can see the fundamental difference
caused by factor market frictions:

$$\hat{X} - \hat{Y} = (\hat{P}_x - \hat{P}_y)\frac{q_a - q_b + \tau_y\sigma_y}{\lambda(\theta - \Sigma)}. \tag{31}$$

It is now clear that the relative supply curve need not be upward
sloping. Even if we could demonstrate that $\lambda\theta > 0$ (which need not be
the case since the equilibrium is distorted), it would still be possible to
have $\lambda(\theta - \Sigma) < 0$ and hence a downward-sloping relative supply
curve. In the next section, we demonstrate that our economy is quite
regular in the sense that when supply is downward sloping, sector X
must be relatively small. In addition, by examining the sector X
production process in detail, we are able to expose the forces at work that
may lead to such perverse supply responses.
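The sign logic of the relative supply response can be made concrete with a small calculation. The sketch below evaluates the coefficient multiplying the relative price change in (31); `friction_term` stands for the extra term that is subtracted from $\theta$ in (28). The function name and every number in the usage lines are hypothetical, chosen only to show how the sign flips once the friction term exceeds $\theta$.

```python
def relative_supply_response(
    q_a: float,
    q_b: float,
    tau_y: float,
    sigma_y: float,
    lam: float,
    theta: float,
    friction_term: float,
) -> float:
    """Coefficient on the relative price change in equation (31)."""
    numerator = q_a - q_b + tau_y * sigma_y          # positive because q_a > 0 > q_b
    denominator = lam * (theta - friction_term)      # sign decides the slope
    return numerator / denominator


if __name__ == "__main__":
    # Illustrative values (assumptions, not estimates from the paper).
    common = dict(q_a=0.3, q_b=-0.2, tau_y=0.5, sigma_y=1.0, lam=0.4, theta=0.2)
    print(relative_supply_response(friction_term=0.05, **common))  # upward sloping
    print(relative_supply_response(friction_term=0.35, **common))  # downward sloping
```

Because the numerator is always positive, the slope of the relative supply curve is governed entirely by the sign of the denominator.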

IV. Interpretation 

We noted in Section II.E that the search sector is generally too
asymmetric. This production inefficiency is the source of the potentially
perverse supply response derived in Section III. To see this, note that
the fundamental difference between our search model and the
standard general equilibrium model is the relationship between
commodity and factor prices. When factor prices increase, there are two
effects: commodity prices must rise as well if the firms are to continue to
break even; in addition, firms will economize on the factor whose
relative price has risen. In the standard model, this second effect is
zero by the envelope theorem. The introduction of factor market
frictions causes this effect to be nonzero and to be important since the
original equilibrium factor intensities are not optimal. An increase in
$w_a/w_b$ causes $P_x/P_y$ to rise in the standard model if sector X is relatively
A-intensive. With frictions, this effect may be offset if the new
production technique adopted in sector X is more efficient, allowing
$P_x/P_y$ to actually fall. The purpose of this section is to determine the
circumstances under which this might occur.

A. Equilibrium Factor Intensities

We begin by describing how the equilibrium physical factor intensities
are determined and how they respond to changes in commodity
prices. This is accomplished by focusing on the zero-profit condition
for sector Y and the condition that guarantees indifference of
workers across sectors.

Figure 1 is analogous to a factor price frontier. The YY' curve
represents combinations of $w_{ay}$ and $w_{by}$ that are consistent with zero
profits in sector Y. Formally, this curve is implicitly defined by
equation (12), where $l_{ay}$ and $l_{by}$ minimize the cost of producing one unit of
Y. Its slope is the negative of the physical factor intensity of the
sector, $-l_y$.

Points along VV' represent combinations of $w_{ay}$ and $w_{by}$ consistent
with equal expected lifetime incomes across sectors for both types of
workers. Points above (below) VV' generate higher expected lifetime
income in sector Y (X). To derive this curve, substitute the value of $\alpha_i$
given in (9) into (6) and then equate $V_{is}$ and $V_{iy}$ (see eq. [3]). We then
obtain

$$w_{iy} = \frac{e_i P_x}{2 - r(2 - e_a - e_b)} \quad \text{for } i = a, b. \tag{32}$$

Given product prices and the proportion of searchers who are of type
A (and thus $e_i$), (32) describes one point on VV'. As s increases,
$w_{ay}/w_{by}$ falls, and there is a movement up VV' to the left. We show in
result A2 of the Appendix that the slope of VV' is
$e_b[2(1 - r)\phi_b - r e_a\phi]/e_a[2(1 - r)\phi_a + r e_b\phi] < 0$.
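The slope expression for VV' can be compared against a direct numerical derivative of (32). The sketch below is illustrative only: it assumes the matching function $L_n(a, b) = ab/\sqrt{a^2 + b^2}$, which is not taken from the paper, and computes the elasticities $\phi_a$ and $\phi_b$ by central differences. All function names are hypothetical.

```python
import math


def match_probs(s: float):
    """(e_a, e_b) under the assumed matching function ab / sqrt(a^2 + b^2)."""
    norm = math.hypot(s, 1.0 - s)
    return (1.0 - s) / norm, s / norm


def vv_point(s: float, p_x: float, d: float):
    """Sector Y wages (w_ay, w_by) on VV' for a given s, equation (32)."""
    r = (1.0 - d) ** 2
    e_a, e_b = match_probs(s)
    denom = 2.0 - r * (2.0 - e_a - e_b)
    return e_a * p_x / denom, e_b * p_x / denom


def elasticities(s: float, h: float = 1e-6):
    """(phi_a, phi_b): elasticities of e_a and e_b with respect to s."""
    e_a, e_b = match_probs(s)
    ea_hi, eb_hi = match_probs(s + h)
    ea_lo, eb_lo = match_probs(s - h)
    return s * (ea_hi - ea_lo) / (2.0 * h) / e_a, s * (eb_hi - eb_lo) / (2.0 * h) / e_b


def vv_slope_formula(s: float, d: float) -> float:
    """Slope of VV' from the closed-form expression in the text."""
    r = (1.0 - d) ** 2
    e_a, e_b = match_probs(s)
    phi_a, phi_b = elasticities(s)
    phi = phi_a - phi_b
    top = e_b * (2.0 * (1.0 - r) * phi_b - r * e_a * phi)
    bottom = e_a * (2.0 * (1.0 - r) * phi_a + r * e_b * phi)
    return top / bottom


def vv_slope_numeric(s: float, p_x: float, d: float, h: float = 1e-6) -> float:
    """Slope dw_by/dw_ay obtained by moving s slightly along VV'."""
    wa_hi, wb_hi = vv_point(s + h, p_x, d)
    wa_lo, wb_lo = vv_point(s - h, p_x, d)
    return (wb_hi - wb_lo) / (wa_hi - wa_lo)
```

Under this assumed matching function the two slopes agree and are negative for the values of s tried here, consistent with the sign claimed in the text.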

Since both equilibrium conditions are satisfied at E, this wage vector
is consistent with diversified production. However, at all other points
along YEY', sector Y is able to offer high enough wages to draw all
workers away from sector X and still break even. Therefore, except at
E, production is specialized to good Y (see n. 9).

An increase in $P_x$ (with $P_y$ held constant) makes sector X more
attractive and causes VV' to shift to the right. This curve now
intersects YY' twice. 14 Both points of intersection represent a wage
vector consistent with diversified production; yet one point ($Z_1$)
represents an increase in $w_{ay}/w_{by}$, while the other point ($Z_2$) signifies a
decrease. Moreover, s falls (rises) as we move to $Z_1$ ($Z_2$) since
$w_{ay}/w_{by}$ and s are inversely related.

Once wages have been determined, we can derive the physical
factor intensities. In sector Y, cost minimization requires
$Y_a/Y_b = w_{ay}/w_{by}$. This equation defines $l_y$ for any given vector of
factor prices. Therefore, an increase in $P_x$ causes $l_y$ to decrease at
$Z_1$ and to increase at $Z_2$.

Next, from (16), $l_x = [1 - r(1 - e_a)]e_b/[1 - r(1 - e_b)]e_a$, which is
increasing in s (result A3 of the Appendix). This implies that an
increase in $P_x$ causes $l_x$ to fall at $Z_1$ and to rise at $Z_2$. We conclude that
increasing $P_x$ causes both sectors to economize on factor A (B) if we
are at $Z_1$ ($Z_2$).


B. Output Responses 

The response of output to changes in relative commodity prices can
be determined with the aid of figure 2. Suppose that the initial
relative price (P) is such that there is a unique wage vector for sector Y
consistent with diversified production (E in fig. 1). Suppose further
that for any wage vector sector X is physically more A-intensive than
sector Y. The factor intensities at E are depicted in figure 2. Finally,
suppose that the economy’s factor endowment is represented by l,
which lies in the diversification cone. 15 We begin by noting that a drop
in the relative price of X causes the economy to specialize in the
production of Y. This follows since the fall in P causes VV' to shift
down so that it does not intersect or touch YY'. In this situation, all
firms produce Y since they can easily afford to pay the wages
necessary to attract sector X workers. Therefore, P is the lowest relative
price consistent with diversified production.

14 Since VV' does not necessarily possess any nice curvature properties, there may be
more than one point of tangency with YY' and more than two intersections when $P_x$
rises. We restrict attention to the simplest case in this paper.

Now, suppose that P rises. This leads to two possible wage vectors
for sector Y. If $Z_1$ represents the new equilibrium, both sectors
become more B-intensive, causing the diversification cone to rotate
counterclockwise. This implies that the demand for type B workers
increases while the supply remains the same. To bring the demand
for labor back into line, production must shift to the A-intensive
good. Thus the production of X relative to Y rises.

15 It is, of course, possible that l lies outside of the diversification cone when we are at
E. If $l > l_x > l_y$ when VV' and YY' are just tangent, the economy is specialized to the
production of good X. Increasing the relative price of good X will ultimately result in
diversified production but with a negatively sloped relative supply curve throughout. If
$l < l_y < l_x$ when VV' and YY' are just tangent, the economy is specialized to the
production of Y. Increasing the relative price of good X will ultimately result in
diversified production and a positively sloped supply curve throughout.

This is not the case, however, if the new wages are given by $Z_2$. In
this case, as P rises both sectors become more A-intensive.
Equilibrium in the labor market therefore requires expansion of the sector
that is relatively B-intensive. Thus the production of X falls as its
relative price rises. Had we assumed that sector X was B-intensive, the
result would have been reversed, with the type $Z_1$ equilibria leading to
the perverse supply responses.

We are now in a position to relate this discussion to the algebra of
Section III. At E of figure 1, YY' and VV' are tangent. If we equate
the slopes derived above and use the fact that in equilibrium
$\theta_{ix} = \alpha_i$, we can show that, at E, $\Sigma = \theta$ (result A4 of the Appendix). In
fact, whenever VV' is steeper (flatter) than YY', it can be shown that
$\Sigma < (>) \theta$. Therefore, if sector X is A-intensive ($\lambda > 0$), then the type $Z_1$
equilibria are well behaved since $\lambda(\theta - \Sigma) > 0$. At $Z_2$, on the other
hand, $\lambda(\theta - \Sigma) < 0$, implying that the supply response is perverse.


C. Search Sector Production Functions 

To gain further insight into the forces behind our results, we devote
this subsection to an analysis of the sector X production process. We
do so because the fact that production in this sector is not
technologically efficient leads to the downward-sloping portion of the supply
curve.

The total number of type i workers in this sector at the end of the
matching process is $L_{ix} = l_{ix}X$. Substitution from (16) yields

$$X = \frac{e_i L_{ix}}{1 - r(1 - e_i)}. \tag{33}$$

By varying s, we can obtain all combinations of $L_{ax}$ and $L_{bx}$ that
produce X units of the search sector good. That is, we can obtain the
search sector isoquant. By solving the $L_{ax}$ expression for s, substituting
into the $L_{bx}$ expression, and differentiating, we can obtain the slope of
this isoquant. Straightforward calculations yield $dL_{bx}/dL_{ax} = e_a\phi_b/e_b\phi_a$

< 0. To guarantee that the isoquants are convex, we assume that ft 

< 0 . 

Production efficiency requires that workers be distributed among
sectors such that the marginal rates of technical substitution are the
same in each sector. The slope of the sector Y isoquant is given by
$-(Y_a/Y_b)$. Therefore, production efficiency requires that

$$\Omega \equiv \frac{Y_a}{Y_b} + \frac{e_a\phi_b}{e_b\phi_a} = 0. \tag{34}$$
This condition will be met by the market only if, in equilibrium, the
search sector is perfectly symmetric (s = 1/2). 16 To see this, note that
we have already shown that in equilibrium $Y_a/Y_b = w_{ay}/w_{by} = e_a/e_b$. The
first equality follows from cost minimization in sector Y while the
second follows from the worker indifference condition. Substituting
into (34) yields

$$\Omega^e = \frac{e_a}{e_b} + \frac{e_a\phi_b}{e_b\phi_a}, \tag{35}$$

where the superscript e denotes that we are evaluating the term at
equilibrium values. Since the search technology is symmetric, $\Omega^e = 0$
if and only if s = 1/2. If s ≠ 1/2, then it is always possible to increase the
production of Y while holding X constant by making sector X more
symmetric (see Davidson et al. [1987] for details).
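The claim that efficiency holds only at s = 1/2 can be illustrated by comparing the equilibrium wage ratio $e_a/e_b$ with the slope of the search sector isoquant implied by (33). The sketch below assumes the matching function $L_n(a, b) = ab/\sqrt{a^2 + b^2}$, which is an illustrative choice and not the paper's; the isoquant slope is obtained by a central difference, and the function names are hypothetical.

```python
import math


def match_probs(s: float):
    """(e_a, e_b) under the assumed matching function ab / sqrt(a^2 + b^2)."""
    norm = math.hypot(s, 1.0 - s)
    return (1.0 - s) / norm, s / norm


def sector_x_inputs(s: float, x: float, d: float):
    """(L_ax, L_bx) needed to produce x units of X, from equation (33)."""
    r = (1.0 - d) ** 2
    e_a, e_b = match_probs(s)
    return x * (1.0 - r * (1.0 - e_a)) / e_a, x * (1.0 - r * (1.0 - e_b)) / e_b


def isoquant_slope(s: float, d: float, h: float = 1e-6) -> float:
    """dL_bx/dL_ax along the unit isoquant, traced out by varying s."""
    la_hi, lb_hi = sector_x_inputs(s + h, 1.0, d)
    la_lo, lb_lo = sector_x_inputs(s - h, 1.0, d)
    return (lb_hi - lb_lo) / (la_hi - la_lo)


def efficiency_wedge(s: float, d: float) -> float:
    """Equilibrium wage ratio e_a/e_b plus the (negative) isoquant slope."""
    e_a, e_b = match_probs(s)
    return e_a / e_b + isoquant_slope(s, d)


if __name__ == "__main__":
    for share in (0.3, 0.5, 0.7):
        print(f"s={share:.1f}  wedge={efficiency_wedge(share, 0.1):+.4f}")
```

For this assumed matching function the wedge vanishes at s = 1/2 and is nonzero on either side, which is the pattern the text describes.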

To see why this is important, return to figure 1. Note that as $P_x$
rises, s falls in a type $Z_1$ equilibrium and rises in a type $Z_2$ equilibrium.
Unless s is 1/2 at point E, this implies that increases in $P_x$ enhance the
efficiency of sector X production in one type of equilibrium and
hamper it in the other.

Suppose, for example, that sector X is physically more A-intensive
for all wage vectors so that there are no factor intensity reversals.
When this is true, s must be greater than 1/2 at E (result A5 in the
Appendix). Now suppose that we move to $Z_1$. As we do so, $P_x/P_y$ and
$w_{ay}/w_{by}$ both rise as in the traditional model. The increase in $w_{ay}/w_{by}$
causes $P_x$ to rise relative to $P_y$ since X is relatively A-intensive. This
traditional effect is captured by $\theta$. At the same time, however, s falls
and approaches 1/2, which implies that X is being produced more
efficiently. This effect, captured by $\Sigma$, puts downward pressure on
$P_x/P_y$. At $Z_1$ we know from above that the traditional effect outweighs
the efficiency effect (since $\theta > \Sigma$). Hence, the mechanism linking
movements in relative commodity prices and factor prices is
qualitatively identical to the mechanism at work in the two-sector Walrasian
model.

Now suppose that we move from point E to $Z_2$. As we do so, $P_x/P_y$
rises and $w_{ay}/w_{by}$ falls. The reduction in relative factor prices puts
downward pressure on the relative price of good X since it is relatively
A-intensive. As before, this traditional effect is captured by the $\theta$ term
in (31). However, as we move toward $Z_2$, s rises and moves further
from 1/2. This implies that X is being produced less efficiently and puts
upward pressure on $P_x$. This effect is captured by $\Sigma$, and at $Z_2$ it is this
effect that dominates. Therefore, commodity and factor prices are no
longer linked in the traditional manner, and it is this result that leads
to the perverse supply responses. Again, had we assumed that X was
relatively B-intensive ($\lambda < 0$) for all factor prices, then a similar
argument could have been used to show that the perverse supply
responses occur at type $Z_1$ equilibria.

16 The result that the optimal value of s is 1/2 is an artifact of the symmetry assumed
in the matching function. If we let s* denote the value of s that solves (34), then it can be
shown that, in general, s* is unique. Our results generalize to nonsymmetric matching
functions if we then define “too asymmetric” to mean “too far from s*.” In addition,
throughout the remainder of the paper, phrases such as “s moves toward 1/2” would need
to be replaced by “s moves toward s*.”

Finally, we note that the perverse supply responses occur when the
search sector is relatively small. It is as if the efficiency effect is subject
to diminishing returns as the sector grows. Thus $\Sigma$ dominates $\theta$ only
when sector X is small. This is somewhat surprising since the only
difference between this model and the Walrasian model is the
inclusion of the trading frictions in one sector. One would therefore expect
that if this sector were small, the model would continue to possess
properties similar to those in the standard model. However, this is not
the case. 17


V. Applications 

The formulation and analysis of a general equilibrium model with
frictional unemployment serve two purposes. First, because we have
been able to exposit the model in a familiar framework, one can easily
examine a host of traditional issues in a framework that is somewhat
more secure against the charge of unrealism to which any standard,
full-employment Walrasian model is subject. It seems appropriate to
address such questions within a model as similar as possible to those
that economists have used in the past, for in this way one facilitates
comparison with existing work and builds on earlier understanding.
Second, by expanding the standard model in this manner we can
examine questions that bear directly on issues surrounding
unemployment. The examples below illustrate the usefulness of this
framework.


A. Tax Incidence 

In this subsection, we consider the impact of a partial factor tax on the 
earnings of type A individuals in sector X. Let T_ax = 1 + t_ax, where t_ax

17 It is well known that perverse supply responses can occur when factor markets are
distorted (see, e.g., Jones 1971; Magee 1976). Since the search sector production process
in our economy is distorted, one might suspect that our results are caused by a
factor market distortion. However, this is not the case. With only factor market distortions,
reversal of the physical and value measures of factor intensity is both a necessary
and a sufficient condition for generating perverse supply responses; this is simply not 
the case when frictions are introduced. In our working paper, we demonstrated that at 
E, the point at which the perverse supply responses begin, V0 > 0. For elaboration on 
this point see Davidson et al. (1986).

TABLE 1

             s > 1/2      s < 1/2

λ > 0        λΣ > 0       λΣ < 0

λ < 0        λΣ < 0       λΣ > 0

denotes the proportional tax rate. We assume that our economy is 
originally tax free and that all tax revenue is refunded in a lump-sum 
fashion to consumers. 

The first step in the analysis is to add a demand side to the model. 
For simplicity, we assume that preferences are homothetic. In this
case, it is well known that the demand side is represented by

\hat{X} - \hat{Y} = -\sigma_d (\hat{P}_x - \hat{P}_y), (36)

where σ_d represents the elasticity of substitution in consumption.

Incorporating the partial factor tax into the supply side of our 
model yields the following basic equations (see result A6 in the Ap¬ 
pendix): 

\(k - t) = (Way - W h )(q a - q b + tyCTy) - (q a - q h )taa, (37) 

P x ~ Py - (0 ~ Way ~ *>by) + («„ ~ 2)f«. (38) 

The algebraic expression for the incidence of T_ax is

V(ti)a> ~ W h ) = [~(q a - q b ) ~ <T d ka a + 2Xa rf ]7' ox , (39) 

where a * X(8 - 2)ct^ + (q a - q b + tycr y ) corresponds to Jones’s 
aggregate elasticity of substitution. We assume an adjustment process 
that directs factors to move between sectors in order to equalize returns.
With this assumption, σ must be positive in order to guarantee
local stability. 

The incidence expression in (39) has three terms on the right-hand 
side. The first two are the usual factor substitution and output effects, 
but the last term captures the effect of the tax on relative wages 
through changes in the efficiency of the search process. This effect
depends on the sign of λΣ, which in turn depends on both the physical
factor intensities and s, the mix of searchers. To see this, note that
from the definition of Σ we have sign(Σ) = sign(s - 1/2). The possibilities
are summarized in table 1.
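
As a rough illustration of the sign rule just stated, the sketch below tabulates the sign of the search-efficiency term from the sign of the factor-intensity measure and the searcher mix s. The function name and the encoding of signs as -1, 0, and +1 are assumptions made for illustration only; they are not part of the model's notation.

```python
def efficiency_term_sign(intensity_sign: int, s: float) -> int:
    """Sign of the search-efficiency term in the incidence expression.

    intensity_sign: +1 if sector X is relatively A-intensive, -1 if B-intensive.
    s: share of type A individuals among searchers, between 0 and 1.
    The symmetry term takes the sign of (s - 1/2), as stated in the text.
    """
    if intensity_sign not in (-1, 1):
        raise ValueError("intensity_sign must be +1 or -1")
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    symmetry_sign = (s > 0.5) - (s < 0.5)  # -1, 0, or +1
    return intensity_sign * symmetry_sign


# Reproduce the four cells of the sign table for illustrative values of s.
for intensity in (1, -1):
    for s in (0.7, 0.3):
        print(intensity, s, efficiency_term_sign(intensity, s))
```

The term vanishes at s = 1/2, which is the knife-edge case between the two columns of the table.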

Regardless of the factor intensities, the initial impact of the tax
causes sector X to contract. If sector X is relatively A-intensive (λ > 0),
it contracts by adopting more A-intensive techniques, thereby increasing
s. This increase in s enhances search efficiency only if the original s
is less than 1/2. If this is the case, the improved efficiency puts downward
pressure on P_x/P_y, and the resulting exit of factors further augments
the relative decline in returns to the factor intensively employed
in the sector. Consequently, w_ay falls further relative to w_by in
this case. If, on the other hand, s initially exceeds 1/2, the contraction of
sector X is offset by the decline in search efficiency. The case in which
sector X is B-intensive is analogous.

In addition to its impact on wages paid in sector Y, the factor tax 
can also be expected to change the ratio of expected lifetime income if 
currently matched to that if currently searching. From (6), (7), and (9)
we have 


v*. _ i - (l - <0(1 - *,) 

----. (40) 

Differentiation of (40) yields 

(~-y «» 

The effect of the tax on the relative welfare of the employed depends 
solely on its effect on the composition of the searchers. For example, 
if the tax increases s, type A searchers have a more difficult time 
finding a match, and thus the relative welfare of their employed 
counterparts rises. 

Intuitively, the tax affects s in two ways. First, it causes sector X 
to contract. If X is relatively A-intensive, it must become more 
B-intensive as it contracts so that the conditions in (14) continue to 
hold. As X becomes more B-intensive, s falls. In addition to this output
effect, there is a substitution effect since sector X becomes rela¬
tively less attractive to type A workers. In this case, the substitution 
effect reinforces the output effect, and s falls. If sector X is relatively 
B-intensive, the output effect would be reversed and would tend to 
offset the substitution effect. 18 


B. The Protection of Jobs 

Calls for protection of domestic production against foreign competi¬ 
tion are strongest when economic opportunities are lowest among a 
segment of the population. Wages and unemployment rates are two 
dimensions of these economic opportunities, and the question arises 
as to the effectiveness of, for example, tariff barriers in improving 
wages and reducing unemployment rates. 

To address the consequences of protection, assume that sector X is 
relatively A-intensive, that the initial equilibrium is stable, and that X 
is imported. Now impose a small tariff that increases the relative 

18 Using (41) and result A6, we obtain oS ■ (!/$)(-AcrA, + The first term in 

parentheses on the right-hand side represents the output effect, while the second term
reflects the substitution effect.

domestic price of X. The first issue is the effect on the relative well-being
of the protected factor, type A individuals, measured as V_a/V_b,
which in equilibrium equals relative wages. In essence, this is the
standard Stolper-Samuelson discussion extended to the case of friction-ridden
factor markets.

We can see from equation (28) that the impact of a change in
relative prices on relative wages depends on the sign of (θ - λ), which
(because λ > 0) has the same sign as the slope of the relative supply
curve. Thus if the initial equilibrium is on the upward-sloping portion
of the relative supply curve (where θ > λ), small tariffs enhance the
relative well-being of unemployed type A individuals. On the other
hand, θ < λ, which occurs only when sector X is relatively small,
implies a decline in the relative wage of the protected type. In general,
the Stolper-Samuelson relationship depends on the mechanism
linking relative product and relative factor prices. Along the downward-sloping
part of the relative supply curve this relationship does
not correspond to that implied by relative factor intensities.

In addition to its influence on wages, the tariff also affects the rate 
of unemployment, = (. L „ + /.*,)/(£„ + L*). From (10) and (11), we 
obtain 


Logarithmic differentiation of (42) reveals that 

A - _( \, + 

\e a + e h - 2<vW 


(42) 


(43) 


The coefficient of ŝ in (43) is monotonically increasing in s. Furthermore,
the coefficient is negative, zero, or positive as s is less than,
equal to, or greater than 1/2. Therefore, the unemployment rate varies
inversely with the symmetry of sector X and directly with its size.
There are two opposing effects on the unemployment rate. First, as s
moves toward 1/2, the sector grows more symmetric and unemployment
per unit of X declines. On the other hand, the sector itself
increases in size, bringing more unemployment. We derive the effect
of any tariff by noting that it shifts W away from the origin in figure
1. If the initial equilibrium is Z_1 (the relative supply slopes upward),
then sector X expands and s falls. If, in free trade, s < 1/2, then
unemployment rises unambiguously. Extension to the case involving
Z_2 and s > 1/2 is straightforward.


C. Minimum Wages

The impact of minimum wage laws has been studied in a general
equilibrium context by several authors including Johnson (1969) and
Mincer (1976). Both authors emphasize that induced changes in the
output mix due to minimum wages can significantly compromise partial
equilibrium results. Our exposition of the simple general equilibrium
model with frictional unemployment can address this issue in a
context in which unemployment arises naturally and exists independently
of the minimum wage. As we show, this feature also permits a
useful treatment of the issue of voluntary and involuntary unemployment.

Consider the incidence of a global minimum wage. Again assume 
that sector X is relatively A-intensive in the physical sense. Let Y serve
as the numeraire and assume that the initial, unconstrained equilib¬
rium is on the upward-sloping portion of the relative supply curve as 
diagrammed in figure 3. 

Figure 4 presents an initial supply-side equilibrium at Z_1. Notice
that type B workers are the low-wage earners; thus a universally applicable
minimum wage increases w_by, for this is the lowest wage paid
in the economy. Let w represent the minimum wage. The wage for
type A workers that permits zero profits in sector Y is then w'_ay, but the
vector (w'_ay, w) creates incentives for workers to emigrate to sector X,
which gives higher expected lifetime income. As workers move to
sector X, the output of X increases and its demand price falls, shifting
W toward the origin until factor market equilibrium is restored at
Z_1'. This is not an equilibrium for the economy, however, because at Z_1'
the supply price falls short of the marginal willingness to pay of demanders.
To see this, note that there is a price below P_x in figure 3
that would clear factor markets at the wages indicated by Z_1'. At this
price the minimum wage would not be binding; yet w_by = w. We
denote this price by P_x' in figure 3 and observe that the product
market is not in equilibrium.

Since the relative demand price for X exceeds the relative supply
price, workers attempt to move to sector X, but because sector Y must
release factors in a constant, relatively B-intensive proportion (to
maintain the wages at Z_1' in fig. 4), the proportion of type A workers in
sector X falls. To determine the effect of this migration on price, we
recall the worker indifference condition. Equation (13) holds for the
unconstrained type A workers, but the analogous condition for type
B workers is now w_by = w. Thus w'_ay solves c_y(w'_ay, w) = 1. Substituting
from (6), (9), and (13) yields


, *JP 

Way ~ 2 - r<2 


(44) 


With w'_ay given, the decline in s must be compensated by a fall in P_x to
leave type A workers indifferent between sectors. In figure 3, we
show P_x/P_y falling as X/Y increases toward the ultimate equilibrium at
T. This fall in the relative price of X redounds on the factor markets
by shifting W in figure 4 in further. The equilibrium wages remain
at Z_1', where YY' intersects the horizontal portion of W. Type A
workers are indifferent between sectors, but type B workers prefer
sector Y. The minimum wage prevents these unemployed type B's
from underbidding their counterparts employed in sector Y.

Without a minimum wage, there are four “classes” in the economy: 
type i searchers and their employed counterparts in sector Y, who 
earn equal expected income, and type i matched workers in sector X, 
who are unambiguously better off. The minimum wage creates a new 
lower class: those type B searchers who, although they strictly prefer 
employment in either sector, cannot find a job. The minimum wage is 
designed to improve the welfare of low-wage workers (it does raise the 
well-being of those who retain sector Y jobs); yet it condemns those 
unemployed to longer durations of unemployment (e_b declines with
the decrease in s) and lower earnings when they do find sector X jobs
(α_b shrinks as the bargaining strength of type B workers diminishes).

This restriction on low wages has implications for the issue of 
voluntary and involuntary unemployment. In the unconstrained 
equilibrium, one can reasonably argue that all observed unemploy¬ 
ment is voluntary. If an unemployed worker were given the choice of 
taking a job in sector Y or continuing to search for a job in sector X,
she would be indifferent. With the minimum wage, type A workers
continue to maintain indifference across sectors and thus can be con¬ 
strued to choose their unemployment voluntarily. On the other hand, 
there is an excess supply of type B workers to sector Y. Those who 
cannot find employment at w are forced to seek employment in sector 
X. If asked, a type B worker who is searching in sector X would 
strictly prefer to work in sector Y at the going wage, but no jobs are
available. All unemployed type B workers are, in a very real sense, 
involuntarily unemployed. 

VI. Conclusion

There is no doubt that over the years the use of the simple two-sector 
general equilibrium model has led to many valuable insights. Not¬ 
withstanding its wide applicability, many issues require the develop¬ 
ment of new models. For example, Diamond’s general equilibrium 
search models (1984a, 1984b) provide us with a framework in which
to consider traditional macroeconomic questions (e.g., the neutrality
of money and the effects of government policies on business cycles) in
a microeconomic framework. Our intent in this paper is to reconcile
these new models with the existing body of literature concerning
simple general equilibrium models. This approach facilitates comparison
and discerns how much of the earlier intuition is preserved.
This gives us some idea as to what extent our analyses and conclusions 
need to be modified. Toward this end, we have provided a model that 
is very much in the spirit of the simple two-sector general equilibrium 
model yet allows for frictional unemployment. We then analyzed its 
structure in a manner that encourages comparison with Jones (1965). 
We demonstrated that while much of the structure remains, the addi¬ 
tion of frictions in the labor market alters the basic relationship link¬ 
ing factor and commodity prices. This new relationship admits the 
possibility of downward-sloping relative supply curves, especially 
when the search sector is small. We then used our model to reexamine 
three old questions: the incidence of taxation, the effects of protec¬ 
tion, and the impact of minimum wage laws. We demonstrated that 
our synthesis can yield new insights in each case. 


Appendix 


Result A1 

Begin by using (3), (6), (13), and (16) to obtain 

„ \l-ri 

a.P* = I- - -K = k«V 

Now sum ot a P x and a bP t to obtain (15). 


Result A2 

Two equations define the VV' curve: V 9 = V„ for i — a, b. From (6) and (3) 
these equations imply 

«P, 

2 ~ r{2 - e a - e„) ’ 

Using the w*y expression we could, in principle, solve for r as a function of w„. 
Substituting this function into the Wj, expression then yields the TV" curve. 
Define Gfiti*,, s( w v )) * 0 to be the equation for VV". Totally differentiating G 
we obtain 


dw^ (dG/dsKBs/dwiy) 

duioy dGIdw^ 

Now use the implicit function theorem to obtain dr/dfrom the expres¬ 
sion above and substitute to obtain 

iui h = cj[2(l - r) + <r,r] - 

= «1[2{1 - r) + #*r) - t' h e.r 

Straightforward algebra can now be used to obtain the desired expression.


Result A3

We wish to show that 4 is an increasing function of s. We have 

, [i - 

k = n - »<i - «»)]/«»' 

The derivative of the numerator of this expression is 

- fl ~ K1 ~ *M _ 4(1 - r) ^ A 

4 4 

By symmetry, the denominator is decreasing in s. Therefore, 4 is increasing 
in s. 


Result A4 

We wish to show that, at point E, X = 0. 

At point E the slope of IT" is - 4 = - (4/W and the slope of FV ’ is e*[2( ] 
- r)4>* - re a «|>]/e a [2(l - r)<j> a + re^]. Equate tnese values, multiply both sides 
of the equation by eje h , and substitute w oy lwsy for eje b to obtain 

_ 6^ = 2(1 - r)4> t ~ re a 4> 

1 - 0 B , 2(1 - r)d> 0 + re b $ 

Solving for 0^, yields 

A = ~ 2(1 - r)4» t 

* 4>[2-r(2 -e a -ei>\ 

Now 6 * a* - 0aj and a„ is given in (9). Combining the two yields X. 
Result A5 

We wish to demonstrate that s>V* at point £ if and only if A > 0. Now begin 
by noting that A = 4 “ 4* w ^ ere 4 i* P ven ‘ n result AS and, at point £, ~4 < s 
equal to the slope of the W curve, which is given in result A4 above. Sub¬ 
stituting these values into A and subtracting yields 

sign(A) = sign[-(<j» a + 4>»)]. 

To sign 4> e + simply use our assumption that e,“(s) < 0. Finally, note tha 
by symmetry of the search technology, <j>« + 4>* = 0 when s = Vs. 


Result A6 

We define T a such that the net income of a matched worker is w* “ (oj p,yr, 
The expected consumption resulting from a match is a,PJ(l - r)T a . Thu 
from (4) and (5), we have the surplus due to a match: 

(1 - d)( 1 - d - r) 

(1 - r)T« 1 - r 

Since only type A income is taxed (7*, = 0), the Nash bargaining soluti' 
does not split the surplus evenly. The solution requires maximization of * 
product of the surpluses. The first-order conditions to this problem ut>| 
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aj( 1 - a,,) = [1 - r(l - «.)]/[ 1 - r(l - e*)]. Substituting into (13) and 
dividing the equation for i « a by that for t = b yields eje h = Wa,TJw h . 
Differentiating, we obtain <J»f = + t„. Repeating the derivations in 

(21)-(29) yields (37) and (38).

This paper hypothesizes that value of time, and consequently labor
force participation, can vary with circumstances specific to a marriage
or a marriage market. Wives' traits valued in the marriage
market are expected to be associated with lower labor force participation,
whereas husbands' traits valued in the marriage market are
expected to be associated with higher participation rates on the part
of wives. Evidence for these hypotheses is found on the basis of
regressions of labor force participation for a sample of Israeli married
women. Inclusion of traits valued in the marriage market and
marital sorting patterns increases the explanatory power of the regressions.


I. Theoretical Background 

The decision to participate in the labor force varies directly with wage 
opportunities expected in the labor market and inversely with the 
value of time in the home. In a general way the decision to enter the 
labor force can thus be modeled as a function g(w, w*), where w is
the expected wage and w* is the value of time in the home. The value
of time is usually considered a function of household characteristics
such as marital status and presence of young children (see, e.g., Mincer
1962; Becker 1965; Heckman 1974). This paper pursues the idea
stated in Becker (1973) that the share of household income going to a
particular spouse, and consequently that spouse's value of time, can
vary with circumstances specific to a marriage or a marriage market. 
It follows that any trait of the wife or husband associated with a 
higher wife’s share of household income also implies a higher value of 
time and therefore a lower likelihood that she participates in the labor 
force. 

Formally, w* is viewed as a function w* = k • I, where w* is the value 
of time of a spouse, I is a vector of household income sources other 
than that spouse’s income from work, and k is the proportion of such 
income that spouse obtains for her own benefit. We are assuming that 
spouses’ well-being depends on the extent to which they control the 
household’s income. In turn, this implies that spouses purchase at 
least some private (as opposed to public) goods and that they do not 
get as much utility out of their spouse’s consumption as they get out of 
their own. 
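
A minimal numerical sketch of this setup follows. It assumes, purely for illustration, that k is a scalar share applied to the sum of the income sources in I and that g takes the simple threshold form "participate when w exceeds w*"; the text leaves g general, and the figures used are hypothetical.

```python
def value_of_time(k, other_income_sources):
    """w* = k * I, with I collapsed to the sum of its components (an assumption)."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k is a proportion and must lie in [0, 1]")
    return k * sum(other_income_sources)


def participates(w, w_star):
    """One simple instance of g(w, w*): work when the wage exceeds the value of time."""
    return w > w_star


# Hypothetical numbers: same wage offer and household income, different shares k.
incomes = [30.0, 10.0]
wage = 15.0
low_k = participates(wage, value_of_time(0.25, incomes))   # w* = 10
high_k = participates(wage, value_of_time(0.50, incomes))  # w* = 20
print(low_k, high_k)
```

With the wage held fixed, the larger share k raises w* above the wage and the spouse stays out of the labor force, which is the direction of the effect the text describes.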

Proportion k of the household’s income obtained by one spouse is 
established as a result of marriage market forces and internal bargaining
between husband and wife (see Becker 1973; Grossbard-Shechtman
1984). This is represented as k = k(V_if, V_im), where V are
vectors of traits, and subscript i stands for individual traits such as age
or education, f for wife and m for husband. As stated here, function k
is very general. It could possibly include multiplicative terms such as 
the product of wife’s and husband’s age or ethnicity or differences 
between wife’s and husband’s age. In any case, it is hypothesized that 
the relative traits of a wife in comparison with those of her husband 
influence the strength of her bargaining power k. For example, if she 
is relatively well endowed in a trait lacked by her husband, the value 
of k would be raised and therefore the value of her time.

Turning this around, we get the hypothesis of compensating differentials
in marriage. A husband with traits that are relatively undesirable
in comparison with his wife’s traits has to compensate her materially
by letting her have a larger proportion k of his income or of some
joint income. When such compensating differentials in marriage occur,
wives’ material needs are more likely to be satisfied by marriage
and married women are less likely to enter the labor force. The hypothesis
thus states that the presence of compensating differentials in
marriage is likely to discourage a married woman's labor force participation
(henceforth called the “compensating differentials hypothesis”). 1

1 Another application of compensating differentials in marriage can be found in
Grossbard-Shechtman (1983).

The extent to which a trait is likely to affect proportion k and,
therefore, a woman’s value of time depends not only on the tastes of 
this household but also on preferences in the marriage market in 
general. If a trait t is generally considered as attractive, the holder of 
such a trait is more likely to translate this into bargaining power k than 
if such a trait is specific to the marriage and therefore attractive only 
to the spouses. Duration of marriage may be a factor influencing k 
since it is positively related to marriage specificity and might reduce 
the (general) market value of a person’s traits. 

In our empirical tests we control for ethnicity: European-American 
(Western) versus Asian-African (non-Western) in the context of Is¬ 
raeli Jews. Being Western is presumably an asset in both the marriage 
market and the labor market, so the wife’s ethnicity is expected to 
raise both w and w* and therefore has an ambiguous impact on labor 
force participation. We can expect compensating differentials on the 
part of non-Western husbands married to Western wives and there¬ 
fore lower participation in the labor force on the part of intermarried 
Western wives. However, a trait such as ethnicity is problematic since 
it might mean different things to different people, and therefore 
the k function may show discontinuities. For instance, Western Jews
might discriminate against marriage to non-Western men and women
while simultaneously non-Western Jews might discriminate against
Western Jews. Consequently, we do not have clear predictions regarding
ethnicity but include the ethnic variables for control purposes.

In the following empirical section we report an empirical test of the
compensating differentials hypothesis using a number of traits of
husband and wife. We test for a higher participation rate for women
with lower k and, consequently, lower w*. Lower values of time could
result from a woman’s deprived social background or recent arrival in
Israel. We also test for lower participation rates for women married to
men considerably older than themselves. Such men are expected
to give their wives compensating differentials, and consequently w* is
expected to be high and labor force participation low.

The predictions derived from the compensating differentials hypothesis
differ from those one could infer from an alternative theory
based on the relationship between mismatches and divorce probabilities.
According to such a theory the more a couple is mismatched, for
example because the husband is much older than the wife, the higher
the probability of divorce and the more the wife is likely to enter the
labor force.

II. Empirical Study 

This theory of labor supply and marital choice was tested using
the mobility survey conducted by Israel’s Bureau of the Census
in 1974, the only version of the annual labor force survey that includes
information on the fathers of husbands and wives that were
interviewed. The hypotheses stated above assume that the more in¬ 
come derived from marriage, the less married women choose to work. 
That would be true only if financial considerations play a major role 
in women’s decision to work. Women who enjoy working outside the 
home on the basis of work’s intrinsic rewards tend to be more 
educated and often do not work full-time. Therefore, to capture 
women driven primarily by work’s financial rewards, our study was 
restricted to women who had not graduated from high school, and we 
defined our dependent variable as full-time participation in the labor 
force. As is apparent from table 1, 12.8 percent of all married women
in the sample worked full-time. The independent variables defined in 
table 1 include variables that have been included in previous empir¬ 
ical estimations of labor supply—wife’s age, earning potential, years 
of schooling, husband’s schooling and income, number of children, 
and years of residence—and innovative independent variables such 
as father’s occupation low, husband-wife ethnicity combinations, and 
husband-wife age combinations (husband older). 

Models of full-time labor force participation were estimated using 
the logit method of estimation. The first model we estimated (regres¬ 
sion 1 in table 2) is one commonly found in the literature. 2 It can be 
seen that once the effect of education on potential earnings is cap¬ 
tured, years of schooling has no impact on full-time labor supply. 
This could reflect a nonlinear effect of schooling on individual success 
in the marriage market. 3 We find that women who have resided more 
years in Israel are less likely to work outside the home. 4 In the light of 
the present theory, years of residence could be interpreted as a desir¬ 
able characteristic that increases a woman’s marriage opportunities, 
thereby raising her w* and reducing her need for income from out¬ 
side work. This effect does not seem related to earning potential and 
discrimination by employers since residence was used as a determi¬ 
nant of potential wife’s earnings. 

Regression 2 in table 2 includes variables that are not commonly 
included in studies of labor supply: a dummy capturing the wife’s 
father’s low occupational status and a variety of combinations of hus¬ 
band’s and wife’s characteristics. Such variables reflecting marital 


2 Most previous studies, including Gronau's (1981) study using Israeli data, include
number and age of children in regressions of married women’s labor force participation.
We followed most previous literature in ignoring the fact that fertility and labor
force participation may be simultaneously determined.

3 See Grossbard (1976, 1980) and Grossbard-Shechtman (1982) for discussions of the
effect of schooling on indicators of w*.

4 Gronau (1981) also found lower participation rates for women having resided
longer in Israel, but he has no explanation for this.



TABLE 1

Means and Definitions of the Variables

Variable                        Mean     Definition

Wife’s characteristics:
  Worked full-time              12.8%    Worked at least 40 hours in the last week
  Age                           39.8     Years
  Young                         21.7%    Dummy = 1 if age between 22 and 29
  Years in Israel (residence)   22.7
  Earnings (ln)                 1.57     Predicted natural logarithm of hourly wage based on subsidiary regression*
  Father’s occupation low       47.7%    Father engaged in occupation ranked 30 or less according to Tyree’s (1981) prestige score (maximum: 87)
  Schooling                     7.1      Years (maximum: 11 years by definition)

Husband's characteristics:
  Schooling                     8.3      Years
  Income (ln)                   9.6      Includes regular and supplemental yearly income of the husband

Wife-husband combinations:
  Wife AA * husband AA          51.5%    AA = born in Asia (except Israel) or Africa or born in Israel and father
  Wife AA * husband EA          5.8%     born in other parts of Asia or Africa (non-Western); EA = born in
  Wife EA * husband AA          3.8%     Europe or America or born in Israel of father born in Europe,
  Wife EA * husband EA          38.9%    America, or Israel (Western)
  Husband older                 4.69     Difference between husband’s age and wife’s age if that difference is larger than 3 years
  Optimal age difference        .40      Dummy = 1 if the difference between husband’s and wife’s age equals 3 or less (includes 10 cases in which wife is 3 or more years older)

Children (number):
  Age 0-4                       .56      Number of children aged 4 or younger
  Age 5-13                      .90      Number of children between ages 5 and 13
  Age 14-17                     .40      Number of children between ages 14 and 17

* The regression's equation is

In earning! " -.19 4 .)44»chooUng + .051 experience 4 .0005<experiencc)* 
(3.80) <8.32) 

- .002experiencc * schooling 4 .01 IBretidence. 

(3.39) (5.80) 

N = 394; R² = .4; t-statistics are in parentheses. The results are similar to those in Gronau (1981).

TABLE 2

Logit Regressions of Wife’s Labor Force Participation

Variable                                     Regression 1      Regression 2

Wife's characteristics:
  Age                                        -.21 (1.28)       -.25 (1.45)
  Age squared                                .002 (1.09)       .002 (1.29)
  Schooling                                  -.25 (1.19)       -.29 (1.34)
  Years in Israel                            -.11* (2.55)      -.12* (2.68)
  Father low occupation                      . . .             .55* (2.06)
  Earnings (ln)                              7.36* (2.11)      8.31* (2.28)
  Asian-African (AA)                         -.67 (1.54)       . . .

Husband’s characteristics:
  Schooling                                  .01 (.30)         0 (0)
  Income (ln)                                -.08 (.26)        -.66 (1.53)
  Asian-African (AA, non-Western)            .45 (1.01)        . . .

Wife-husband combinations:
  Wife AA * husband AA                       . . .             -10.29** (1.69)
  Wife AA * husband EA                       . . .             -19.20 (1.58)
  Wife EA * husband AA                       . . .             -4.30 (.35)
  Wife AA * husband AA * husband income      . . .             .99 (1.57)
  Wife AA * husband EA * husband income      . . .             1.87 (1.54)
  Wife EA * husband AA * husband income      . . .             .61 (.48)
  Husband older                              . . .             -.12** (1.77)
  Wife AA * husband AA * husband older       . . .             .12** (1.85)
  Wife AA * husband EA * husband older       . . .             .25* (2.13)
  Wife EA * husband AA * husband older       . . .             -.07 (.47)
  Optimal age difference                     . . .             -.60 (1.26)

Children:
  Age 0-4                                    -.68* (2.73)      -.79* (2.97)
  Age 5-13                                   -.25 (2.55)       -.28** (1.65)
  Age 14-17                                  -.10 (.42)        -.08 (.33)

Constant                                     -3.06             2.6
-2(log L_1 - log L_2)†                       43.43             67.77
Predictive accuracy                          .499              .526
Number of observations                       635               635

Note: See table 1 for explanations of variables. Asymptotic t's are in parentheses.

† L_1 is the likelihood for the model containing only the intercept (-2 log likelihood = 484.78); L_2 is the likelihood
for this particular model.

* P < .05.

** P < .10.

choice appear to improve the model’s predictive power. A log likelihood
test comparing regressions 1 and 2 shows that the inclusion of
these additional variables significantly improves the model's predictive
power (chi-squared significant at the 0.5 percent level). Some of
the signs of the coefficients in regression 2 also seem to confirm our
hypotheses. The dummy reflecting the wife’s relatively deprived
background (her father was employed in an occupation of low prestige
and generally low income), which can be viewed as a nondesirable
trait in the marriage market, is found to be significantly positive. 5

It also appears from regression 2 that if a woman is married to a 
man at least 3 years older than she is, the older he is the less likely she 
is to work full-time. This is consistent with our compensating differ¬ 
entials hypothesis. Older men are relatively penalized in the marriage 
market and have to “buy” themselves the services of a wife by offering 
high pecuniary w* and thus making it unnecessary for women to work 
full-time. 6 

Interestingly, this negative relation between older husband and 
labor supply varies with ethnicity. 7 Asian-African Jewish women are 
less likely to receive compensating differentials from husbands much 
older than they are. By using the approximation rule bp{ 1 - p), where 
b is the regression coefficient and p the probability of participation 
(see Pindyck and Rubinfeld 1981), we found that for European- 
American women each additional year of their husband’s age (beyond 
a 3-year difference) reduces full-time labor force participation by 1 
percent; that is, older husbands seem to “pay” compensating differen¬ 
tials. In contrast, the net effect of older husband among Asian- 
African women married within their own ethnic group is zero, which 
may reflect the fact that the average age difference at first marriage
tends to be much higher than 3 years (the Israeli average) among Jews 
of Asian-African origin. Asian-African Jewish women appear to be 
willing to marry older European-American men without asking for 
any pecuniary compensation at all. In fact, the net effect of older 
husband on the labor supply of Asian-African wives married to Euro¬ 
pean-American husbands is to raise their participation rate 1 percent 
above that of European-American women married within their own 
group. 
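The approximation rule used above can be sketched in a few lines. This is only an illustration: it assumes the coefficients come from a logit-type model (which is what the rule b·p(1 − p) presupposes), and the participation probability `p_assumed` is a hypothetical value chosen for the example, not a figure reported in the paper.

```python
def logit_marginal_effect(b: float, p: float) -> float:
    """Approximate change in probability per unit change in a regressor,
    using the rule b * p * (1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return b * p * (1.0 - p)


# Coefficient on "husband older" from regression 2 (table above).
b_husband_older = -0.12
# Hypothetical full-time participation probability (an assumption).
p_assumed = 0.10

effect = logit_marginal_effect(b_husband_older, p_assumed)
print(f"approximate effect per additional year: {effect:.4f}")
```

With an assumed probability near 0.10, the rule gives an effect of roughly one percentage point per year, which is the order of magnitude discussed in the text.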

It is difficult to find alternative explanations for this older husband 
effect. It has been argued in a Belgian economic study of married 
women’s labor supply that older husbands have higher incomes and 
can therefore afford a housewife (DeWachter 1982). But husband's 

5 The positive sign of wife's father in low occupation can also be interpreted as a
negative income effect. We do not have a good measure of nonwage income. The
father low occupation variable is based on a ranking by social status as well as income; it
is likely that women whose fathers had been employed in low-status occupations might
also have lower nonwage income at the time of the survey.

6 Separate regressions in which husband older was measured as the ratio of the age
difference between husband and wife to the wife's age showed that it is not the absolute
difference in age that matters, but that difference in proportion to the wife's age.

7 The statistical significance of the interaction terms with ethnicity can be questioned.
Large samples are needed in order to assume asymptotic normality and consistency.
We had only small numbers of couples in which one spouse is Western and the other is
not, but we had large numbers of people married to spouses of their own ethnicity.
schooling and income are included in the regressions. Granted, our
income measure is current income and may be an imperfect measure 
of permanent income. But that would strengthen our argument. If 
permanent income matters more than current income and men are 
generally in the upward-sloping part of their lifetime earnings 
profile, older men earning the same income as younger men would 
have a lower permanent income, and their wives would be more likely 
to work full-time. Also, it was not the age of the husband that was
found to matter, but a particular function of the age difference be¬ 
tween husband and wife. A culturally oriented alternative explana¬ 
tion is that older husbands want their wives to fit the stereotype of the 
traditional housewife. Again, that would not explain why we found 
this particular function of the age difference or why it varies by hus¬ 
band-wife ethnic combination. 

Inclusion of variables reflecting marital choice also modifies the 
coefficients of regressors included in traditional models of female 
labor force participation. Some coefficients that were insignificant 
become significant after the inclusion of marital choice variables. For 
example, in regression 2, children are found to deter mother’s labor 
force participation not only if the children are 4 or younger but also if 
they are between ages 5 and 13. 8 

III. Summary and Conclusions 

Husband’s characteristics valued in the marriage market are posi¬ 
tively related to wife’s labor supply through a mechanism of compen¬ 
sating differentials. Women with qualities valued in the marriage 
market are less likely to work outside the home. Biases in the effect of 
husband’s or wife’s characteristics on wife’s labor supply may be 
caused by insufficient control for other characteristics and marital 
sorting patterns. This suggests that female labor supply studies 
should include determinants of success in marriage markets in addi¬ 
tion to the variables that are usually included. 

The paper considers the question whether observed price differen¬
tials reflect perceived differences in quality, service agreements, or 
location or whether information imperfections can explain this phe¬ 
nomenon. It sets out theoretical arguments linking inflation to re¬ 
ductions in the information stock held by agents and thus to greater 
price dispersion. The hypothesis is tested using monthly price data 
for 13 uniquely defined goods sold in Israel between 1971 and 1984. 
Price dispersion is shown to be positively related to the rate of mar¬ 
ket price inflation. Since inflation is an unlikely proxy for changes in 
perceived characteristics, the findings support price dispersion theo¬ 
ries based on "optimally imperfect" decision making. 


Price dispersion is a manifestation—and, indeed, it is 
the measure— of ignorance in the market. [Stigler 
1961, p. 214] 

Casual inspection of almost any consumer market will reveal that the 
price of a good depends on where it is purchased. Indeed, a number 
of careful studies also indicate that prices differ by seller. 1 The in- 


This paper is based on parts of my Ph.D. dissertation (Van Hoomissen 1987a). I am 
grateful for the help and comments of Steve Sheffrin, Joaquim Silvestre, and Louis 
Makowski. Special thanks are due Zvi Adar and Avner Ben-Ner. The cooperation of 
Zipora Remer and Reuven Karshai of the Prices Division and the staff of the Archives 
department at the Israel Central Bureau of Statistics is gratefully acknowledged. 

1 Stigler (1961) cites price dispersion among 27 Chevrolet dealers (coefficient of price
variation 1.72 percent) and 14 anthracite coal dealers (coefficient of price variation 6.8
percent). Pratt, Wise, and Zeckhauser (1979) document significant price dispersion in
Boston for 39 products with approximately 12 quotations each, and Carlson and
Pescatrice (1980) document price dispersion for 34 products (between six and 14
quotations each) with coefficients of variation between 3 and 41 percent. Marvel (1976),
Mathewson (1983), and Dahlby and West (1986) also cite empirical evidence.
stinctive reaction of many economists, however, is to explain this phe¬
nomenon not as price dispersion, which would imply that the same 
good has different prices at different outlets in the market, but rather 
as price differentials reflecting actual or perceived differences in 
quality, service agreements, or location. Such ideas are firmly embed¬ 
ded in microeconomic theory. 

This paper proposes a basis for rejecting the “multiple characteris¬ 
tics" hypothesis as the sole determinant of price dispersion. While it is 
acknowledged that few goods, no matter how closely defined, do in¬ 
deed represent perfect substitutes, it is nevertheless to be expected 
that differences in location and perceived product quality will cause 
price differentials that are relatively stable over time, varying only 
with changes in the distribution of quality or location through the 
population of firms in a market, or with changes in consumer preferences.
In the absence of notable trends in firm location, market
power, consumer preferences, product quality, or service agreements,
the stronger the evidence of a nonrandom time trend in price
dispersion, the more weakly stands the hypothesis that price disper¬ 
sion is due only to the dispersion of tastes or product characteristics. 2 

It is the contention of this paper that price dispersion is strongly 
influenced by the presence of differentially informed consumers. 
This is suggested by equilibrium price dispersion theory, which fol¬ 
lowed the seminal articles of Stigler (1961) and Rothschild (1973). 
Combining firm pricing models and the theory of “optimally imper¬ 
fect” consumer decision making—the theory that when search is 
costly and the return to search declining consumers will normally not 
search exhaustively—different authors have demonstrated how 
profit-maximizing firms, knowing something about consumer search, 
can, in the aggregate, generate a price distribution with more than 
one price for the same good. 3 If it is the case that price dispersion 


2 The contention that examination of the time behavior of price dispersion is an
appropriate test of this hypothesis is supported by Stigler and Kindahl (1970, p. 89):

If there were a unique price, there would be a unique price change from one
date to the next. The converse is less simple. It would be possible that there
was no dispersion of movement even with dispersion of prices, at a given
time, because the differentials due to transportation costs, quality, lot size,
etc., were stable. The dispersion in prices due to incomplete knowledge,
however, would lead to dispersion of price movements. If seller A has a price
3 percent higher than B today, and yet makes sales because of incomplete
price search by buyers, it is unlikely that A and B will change prices
simultaneously and in identical proportion, and impossible if they have incomplete
knowledge of all prices.

3 Examples of equilibrium price dispersion models based on the assumption of
differential information are Axell (1977), Butters (1977), Pratt et al. (1979), Burdett and
Judd (1983), and Rob (1985). A different approach is presented in “sales” models, such
as Varian (1980).
accompanies “ignorance,” then it follows that inflation, long believed
to cause “confusion” about prices (see, e.g., Deaton 1977; Lucas 
1977), should also be accompanied by increased price dispersion. Evi¬ 
dence is provided here that inflation—which is an unlikely proxy for 
changes in perceived characteristics but a likely proxy for changes in 
information—strongly influences the extent of price dispersion. This 
finding supports price dispersion theories based on optimally imper¬ 
fect decision making. 4 

Hypotheses are tested on data for 13 uniquely defined consumer 
goods—whose prices were observed monthly at different outlets in 
Israel—between 1971 and 1984. Since separate data for 13 goods 
exist, several hypotheses about the uniformity of the price dispersion/ 
inflation trade-off are also tested. The paper is organized as follows: 
Section I reviews some of the theoretical underpinnings of price
dispersion theory and suggests a rationale for consumers to be less
informed during an inflationary period than otherwise. Section II
details hypotheses to be tested and discusses the data, econometric
methodology, and empirical results. Section III concludes the paper. 


I. Theoretical Framework 

Why should consumers behave differently when searching for the 
lowest price during an inflationary period than during a period of 
stable prices? Suppose, for example, that during an inflationary inter¬ 
lude firms do not change their prices in lockstep. In response, fully 
rational consumers who purchase the same item repeatedly will “buy" 
less information (which is costly to acquire and maintain) during an 
inflationary period because currently acquired information will have 
diminished future use. That is, because parameters of the price distri¬ 
bution and store rankings change from period to period when there is 
inflation, information will have diminished return and consumers 
who purchase an item repeatedly will find it optimal to remain less 
informed. Consideration of the time behavior of prices adds another 
dimension to the theory of optimally imperfect decisions: if the future 
value of information is reduced by variances in store price changes, 
then the optimal amount of information a consumer will choose to 
hold will be smaller. 5 Because consumers are less informed during 


4 Several authors have shown that inflation and relative price variability are positively
correlated. See Vining and Elwertowski (1976), Parks (1978), Cukierman (1979),
Cukierman and Leiderman (1981), and Sellekaerts and Sellekaerts (1984). Fischer
(1981, p. S91) suggested that a more appropriate test of theories in this field requires
looking at the relationship between inflation and true price dispersion. Domberger
(1987) takes a step in this direction, examining the behavior of “intramarket” relative
price variability—the degree that price ratios, for substitutable goods, vary over time.

5 Stigler (1961, p. 218) also argues that search behavior will be different when price
distributions change from period to period: “If asking prices are uncorrelated in suc-
inflationary periods, it is possible that store managers will take this
into consideration in choosing prices. 

The consumer process of determining an optimal search strategy in 
these circumstances is similar to the firm process of determining an
optimal investment strategy. Consumers store a stock of information 
about price locations into which there is a flow of new information 
each period (search) and a drainage of old information via forgetting 
and obsolescence. A consumer who has not searched previously and 
who visits $n_t$ stores during period $t$ will possess a stock of information,
$I_t$, equal to some function of $n_t$, $f(n_t)$, depending on how search is
translated into information. If the consumer had searched in previous
periods and if some information from this earlier search were still
valid and remembered during $t$, then $I_t$ would be greater than $f(n_t)$.

Consumers determine the optimal stock of information to hold and 
the optimal amount of search to carry out at all times $t$ given (1) the
cost of search, (2) the process $f$, and (3) the way information stocks
depreciate over time. In Van Hoomissen (1987b), I prove in detail
that an increase in the depreciation rate, brought about by an increase 
in the rate of inflation, will cause consumers to hold smaller informa¬ 
tion stocks. Intuitively, the result states that when costly information 
becomes less useful, people will purchase less. The model does not 
necessarily suggest that there will be less search during inflation; 
more search may be necessary to hold a smaller information stock. It 
is the result that the stock of information a person holds will decline 
during inflation that drives the contention that price dispersion will 
increase during an inflationary period. 
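The stock-and-flow logic of this paragraph can be illustrated with a small sketch. It is not the model proved in Van Hoomissen (1987b): the law of motion $I_t = (1 - \delta) I_{t-1} + f(n_t)$, the square-root form of $f$, and all function names below are assumptions made only for illustration, with $\delta$ standing in for the depreciation rate that inflation is argued to raise.

```python
import math


def info_flow(n_stores: float) -> float:
    """Hypothetical f(n): information gained from visiting n stores
    (square root chosen to give a declining return to search)."""
    return math.sqrt(n_stores)


def simulate_stock(n_stores: float, delta: float, periods: int,
                   initial: float = 0.0) -> list:
    """Path of I_t under the assumed rule I_t = (1 - delta) * I_{t-1} + f(n)."""
    stock, path = initial, []
    for _ in range(periods):
        stock = (1.0 - delta) * stock + info_flow(n_stores)
        path.append(stock)
    return path


def steady_state_stock(n_stores: float, delta: float) -> float:
    """Long-run stock when search is constant: f(n) / delta."""
    if not 0.0 < delta <= 1.0:
        raise ValueError("delta must lie in (0, 1]")
    return info_flow(n_stores) / delta


def required_search(target_stock: float, delta: float) -> float:
    """Constant search needed to sustain a target stock: n = (I * delta)**2."""
    return (target_stock * delta) ** 2


# Same search effort, faster depreciation: the sustained stock is smaller.
print(steady_state_stock(4.0, 0.25), steady_state_stock(4.0, 0.50))
# A smaller stock can still need more search when depreciation is high.
print(required_search(4.0, 0.25), required_search(3.0, 0.50))
```

Under these assumed functional forms, holding a stock of 3 at a depreciation rate of 0.50 takes more search than holding a stock of 4 at a rate of 0.25, which mirrors the remark that more search may be necessary to hold a smaller information stock.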

The information problems experienced by consumers will be 
shared, in part, by stores. In addition, as shown by Sheshinski and 
Weiss (1977), if there are costs to price adjustment and if firms adjust 
prices independently, there will exist a variance of price changes that 
will increase with the rate of inflation. These things each individually, 
and certainly in tandem, imply a positive correlation between the rate 
of inflation and price dispersion. 

II. Hypotheses, Data, and Econometric 
Specification 

A. Hypotheses and Tests 

The central hypothesis suggested here is that price dispersion will 
increase with inflation. The theory does not suggest, however, that all 

cessive time periods, the savings from search will pertain only to that period, and search
in each period is independent of previous experience. If the correlation of successive
prices is positive, customer search will be larger in the initial period than in subsequent
periods.”
goods will be affected to the same degree; it is possible that frequency
of purchase and relative expensiveness will affect the strength or 
weakness of the relationship between price dispersion and inflation. 

There is support in the literature for the hypothesis that different 
types of goods will have different degrees of price dispersion. For 
example, Stigler (1966) suggests that expensive goods will have more 
price dispersion than less expensive goods, and Carlson and Pesca- 
trice (1980) find empirical support for the hypothesis that frequently 
purchased goods will have smaller and expensive goods larger price 
dispersion. It is therefore interesting to test whether different goods 
are affected differently by inflation as well. 6 

The hypothesis that price dispersion increases with the market rate 
of inflation is tested by regressing price dispersion on inflation, and 
hypotheses regarding price dispersion for different types of goods 
are tested by examining differences in the regression results for dif¬ 
ferent types of goods, classified according to expensiveness and fre¬ 
quency of purchase. 


B. The Data 

The hypotheses above are tested empirically on Israeli data for the 
years 1971-84. Over this period the monthly inflation rate rose from 
very low and even negative rates to rates in excess of 20 percent per 
month. Over this period the price level rose over 500 times. Hence, 
the Israeli experience provides an especially suitable case for studying 
the relationship between price dispersion and inflation. 

The detailed price data from which a price dispersion summary 
statistic is calculated are drawn from data gathered by the Israel Cen¬ 
tral Bureau of Statistics in its monthly price surveys for the consumer 
price index. 7 For each of 13 goods, the bureau provided all the raw 
data collected between 1971 and 1984. (Statistics on the monthly sam¬ 
ple sizes are provided in table 1.) The sample of 13 goods was selected 
to include goods varying in price relative to average monthly income 
and in frequency of purchase, subject to the availability of data on 
narrowly defined goods that permit unique identification over time. 
Frequency of purchase and relative expense were determined by 
common observation rather than methodical study. The frequent-purchase
items are all food items: beef liver, chicken liver, tea bags,


6 A theory explaining differences by good type may be found in Van Hoomissen
(1987b).

7 For the CPI, the bureau samples the prices of a broad range of consumption items,
approximately 1,000 commodities and services, from about 1,500
outlets around the country. Detailed information about the construction of the
CPI is available in Israel Central Bureau of Statistics (1968).
TABLE 1

Sample Statistics, Monthly Sample Sizes

                              Number of Observations per Month

Good                          Mean        Maximum     Minimum

Wine                          9.06        16          4
Brandy                        9.26        14          4
Fruit drink concentrate       21.05       40          12
Falafel sandwich              8.39        9           5
Instant coffee                33.09       54          18
Instant cocoa mix             17.01       25          5
Tea bags                      15.18       30          4
Unpackaged sugar              25.79       35          11
Chicken liver                 29.01       64          9
Beef liver                    37.38       71          22
Refrigerator                  11.23       14          2
Light bulb                    25.32       35          13
Child's book                  13.49       26          5


sugar, instant coffee, falafel sandwich from a street stand, bottled 
fruit drink concentrate, and wine. The relatively inexpensive but less 
frequently purchased items are brandy, instant hot cocoa mix, light 
bulbs, and a child’s book. The only expensive, infrequently purchased 
item is a refrigerator. 

While the data do not include the name or location of the stores 
sampled, sample sizes remained basically stable from month to month 
and the specific stores sampled changed only at infrequent but irregu¬ 
lar intervals (where 10 stores are sampled monthly, a store might be 
dropped and replaced with another one once every 4 months). Since 
the data clearly show where a store was temporarily omitted and 
where it was permanently replaced, it is possible to measure the degree
of price dispersion—or, rather, interstore relative price variability—from
month to month. Interstore relative price variability for
good $i$ ($i = 1, \ldots, 13$) in month $t$ is the unweighted standard deviation
of store price movements around the average movement for the good
during month $t$. This measure is employed by Vining and Elwertowski
(1976) and Domberger (1987), among others, to measure relative
price variability and is defined as

$$V_{it} = \sqrt{\frac{1}{n_{it}} \sum_{j} \left(DP_{ijt} - DP_{it}\right)^{2}},$$

where $n_{it}$ is the number of stores sampled for good $i$ in month $t$,
$DP_{ijt} \equiv \ln(P_{ijt}/P_{ij,t-1})$ is the logarithmic first difference between the
prices charged by store $j$ ($j = 1, \ldots, n_{it}$) in periods $t$ and $t - 1$
(approximately the rate that store $j$ inflates the price of $i$ during month $t$),
and $DP_{it} \equiv (1/n_{it}) \sum_{j} DP_{ijt}$ can be interpreted as the approximate
inflation rate in market $i$ in period $t$. Two time series, $V_i$ and $DP_i$, are
constructed for each good containing the monthly variability and market
inflation measures. Sample statistics for each $V_i$ and $DP_i$ may be found
in table 2. An alternative measure of price dispersion, the coefficient
of variation, has other qualities and is employed in Van Hoomissen
(1987b).
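As a rough sketch of how the two monthly measures could be computed for one good, the function below takes the prices charged by the same stores in two consecutive months. The function name and the example prices are hypothetical; the divisor $n_{it}$ (rather than $n_{it} - 1$) follows the "unweighted standard deviation" described in the text and should be read as an assumption about the exact formula.

```python
import math


def interstore_variability(prev_prices, curr_prices):
    """Return (DP_it, V_it) for one good in one month.

    prev_prices[j] and curr_prices[j] are the prices charged by store j
    in months t-1 and t; only stores observed in both months belong here.
    """
    if len(prev_prices) != len(curr_prices) or not curr_prices:
        raise ValueError("need the same non-empty set of stores in both months")
    # Store-level log price changes DP_ijt = ln(P_ijt / P_ij,t-1)
    dp = [math.log(c / p) for p, c in zip(prev_prices, curr_prices)]
    n = len(dp)
    market_inflation = sum(dp) / n  # DP_it
    variability = math.sqrt(sum((x - market_inflation) ** 2 for x in dp) / n)  # V_it
    return market_inflation, variability


# Hypothetical prices at four stores in two consecutive months.
last_month = [10.0, 10.5, 9.8, 11.0]
this_month = [10.8, 10.9, 10.9, 11.2]
dp_it, v_it = interstore_variability(last_month, this_month)
print(f"market inflation: {dp_it:.4f}, interstore variability: {v_it:.4f}")
```

If every store raised its price by the same proportion, the variability measure would be zero regardless of the inflation rate, which is why the measure captures dispersion of price movements rather than the level of inflation.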

The data are for the period January 1971 through September 1984 
excluding 1973 and 1975; in October of 1984 a comprehensive price 
freeze was instituted, making further observations unfit for analysis. 
The years 1973 and 1975 are not included in the survey because the 
data were not provided by the bureau, and many of the goods are 
missing several additional years. 8 


C. Econometric Specification and Results 

The model tested here is 

$$V_{it} = \alpha_i + \beta_i (DP_{it}) + \theta_i (DP_{it})^{2} + \epsilon_{it}.$$

A quadratic term is added because this specification fits the data bet¬ 
ter in most cases. The system is best tested using Zellner's seemingly 
unrelated regression estimator (SURE) since the equations are theoretically
related and it is likely that $E(\epsilon_{it}\epsilon_{jt}) \neq 0$. In this case generalized
least squares SUR estimators are more efficient (see also Domberger 
1987). However, it was possible to estimate SURE parameters for only 
six of the 13 goods; 9 hence the joint generalized least squares (JGLS) 
method is used for six of the 13 goods and ordinary least squares 
(OLS) parameters are estimated for the remainder. 
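A minimal sketch of the equation-by-equation OLS case follows. It fits the quadratic specification for a single good by solving the normal equations directly and is not the SURE/JGLS estimator used for six of the goods; the function names and the synthetic data are assumptions for illustration only.

```python
def fit_quadratic(dp, v):
    """OLS estimates (alpha, beta, theta) for v = alpha + beta*dp + theta*dp**2."""
    rows = [[1.0, x, x * x] for x in dp]
    # Normal equations (X'X) b = X'y, stored as an augmented 3x4 matrix
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, v))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(aug[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (aug[i][3] - tail) / aug[i][i]
    return tuple(coef)


def slope_at(beta, theta, dp_value):
    """Slope of V with respect to DP at a given inflation rate."""
    return beta + 2.0 * theta * dp_value


# Synthetic monthly data generated from known parameters (not the paper's data).
dp_series = [0.0, 0.02, 0.05, 0.08, 0.10, 0.15, 0.20]
v_series = [0.02 + 0.9 * x - 1.5 * x * x for x in dp_series]
alpha, beta, theta = fit_quadratic(dp_series, v_series)
mean_dp = sum(dp_series) / len(dp_series)
print(alpha, beta, theta, slope_at(beta, theta, mean_dp))
```

Because the synthetic series is generated without noise, the fit should return the generating parameters up to rounding error; with real data one would also want standard errors, which this sketch omits.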


8 Six goods are “complete,” containing data for 141 months. The goods with missing
years are the following:

Good               Number of Months    Years Omitted

Fruit drink        129                 1972
Falafel            128                 1976
Coffee             117                 1972, 1976
Chicken liver      129                 1971
Refrigerator       57                  1971-72, 1974, 1976-77, 1980-81
Light bulb         81                  1971, 1977, 1978, 1980-81
Child's book       105                 1972, 1977-78



9 Neither of the computer packages used for this analysis would estimate SURE
parameters for sets of equations containing an unequal number of observations.
TABLE 3

Joint Generalized Least Squares Results

Good                    Intercept    DP_it       DP_it^2      R^2      F-Value

Frequent purchase:
Wine                    .0187        .9285       -1.4790      .593     91.89
                        (4.142)      (6.358)     (1.869)
Tea bags                .0258        .7655       -1.0834      .516     67.22
                        (6.093)      (8.983)     (4.012)
Sugar                   .0341        .2735       -.2774       .258     21.91
                        (9.666)      (6.205)     (5.413)
Beef liver              .0447        .4158       -.0670*      .624     104.47
                        (18.45)      (5.369)     (.141)
Infrequent purchase:
Brandy                  .0181        1.1461      -3.1757      .680     133.69
                        (5.611)      (11.90)     (6.614)
Cocoa mix               .0227        .8960       -1.8988      .690     140.00
                        (7.799)      (14.66)     (8.722)

Note.—t-statistics are in parentheses.

* Significant with less than 90 percent confidence.


The hypothesis that interstore relative price variability (price dis¬ 
persion) increases with the rate of inflation finds strong support here 
(results of the JGLS and OLS regressions are summarized in tables 3 
and 4, respectively). The results indicate a positive relationship be¬ 
tween price dispersion and the market rate of inflation in all goods 
over most of the range of experienced inflation rates. 

For all goods the estimates of $\beta_i$ are positive, and for all but one they
are significantly different from zero. The estimates of $\theta_i$ are most
often negative, implying that price dispersion increases with inflation 
at a decreasing rate; in all cases the range at which price dispersion is 
predicted actually to decline with inflation occurs at relatively high 
levels of inflation for which there were few or no observations. For 
only one good, refrigerators, is the evidence of a positive relationship 
between price dispersion and inflation statistically weak. 
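The inflation rate at which a fitted quadratic starts to predict declining dispersion follows from setting the slope $\beta_i + 2\theta_i DP_{it}$ to zero. The sketch below applies that algebra to the JGLS point estimates reported in table 3; the function name is hypothetical, and the output says nothing about sampling uncertainty in the estimates.

```python
def turning_point(beta: float, theta: float) -> float:
    """Monthly inflation rate at which beta + 2*theta*DP equals zero."""
    if theta >= 0:
        raise ValueError("no interior maximum unless theta is negative")
    return -beta / (2.0 * theta)


# (beta, theta) point estimates copied from table 3 (JGLS results).
table3_estimates = {
    "Wine": (0.9285, -1.4790),
    "Tea bags": (0.7655, -1.0834),
    "Sugar": (0.2735, -0.2774),
    "Beef liver": (0.4158, -0.0670),
    "Brandy": (1.1461, -3.1757),
    "Cocoa mix": (0.8960, -1.8988),
}

for good, (beta, theta) in table3_estimates.items():
    rate = turning_point(beta, theta)
    print(f"{good}: dispersion peaks near {100 * rate:.1f} percent per month")
```

For wine, for example, the implied turning point is a monthly inflation rate of about 31 percent.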

There are clear differences in the $\beta$ and $\theta$ estimates for the different
goods, and the slopes of the estimated functions with respect to
$DP_{it}$ (measured at the mean of $DP_{it}$) vary between 0.3 and 0.94. No
clear or obvious differences appear between the parameter estimates 
for frequently and infrequently purchased goods, 10 but the weak rela¬ 
tionship between inflation and price dispersion for a refrigerator may 
be explained by its expensiveness (which invites extensive consumer 
search) and by the likelihood that retail prices were fixed in U.S. 
dollars for this good over much of the period studied. 


10 For a more extensive analysis by good type, see Van Hoomissen (1987b).
TABLE 4

Ordinary Least Squares Results

Good                    Intercept    DP_it       DP_it^2      R^2      F-Value    Slope(a)

Frequent purchase:
Wine                    .0168        .9962       -1.7118      .594     92.53      .82
                        (3.638)      (6.588)     (2.087)
Drink                   .0294        .7966       -1.1290      .648     105.96     .66
                        (8.325)      (14.38)     (10.93)
Falafel                 .0153        1.2566      -4.1926      .764     185.05     .78
                        (4.976)      (13.96)     (7.846)
Coffee                  .0229        .6774       -1.0400*     .157     9.66       .57
                        (2.262)      (2.982)     (1.108)
Tea                     .0233        .8498       -1.3060      .520     68.16      .72
                        (5.368)      (9.586)     (4.639)
Sugar                   .0320        .3391       -.3473       .268     23.08      .30
                        (8.751)      (6.790)     (5.949)
Chicken liver           .0373        .6360       -.7000*      .506     58.84      .56
                        (8.477)      (5.326)     (1.188)
Beef liver              .0402        .5618       -.6287*      .638     111.09     .50
                        (15.689)     (6.698)     (1.220)
Infrequent purchase:
Brandy                  .0148        1.2587      -3.4829      .685     137.18     .94
                        (4.398)      (12.09)     (6.688)
Cocoa mix               .0191        .9983       -2.1439      .696     144.38     .78
                        (6.258)      (14.88)     (8.911)
Light bulb              .0191        1.2314      -3.8005      .647     65.17      .78
                        (3.581)      (8.121)     (4.764)
Book                    .0455        .2269       .9710        .391     29.89      .35
                        (8.228)      (2.123)     (2.052)
Expensive:
Refrigerator            .0428        .0348*      .4105*       .037     .94        .09
                        (5.490)      (.203)      (.462)

Note.—t-statistics are in parentheses.

(a) Since the estimating equation is $V_{it} = \alpha_i + \beta_i(DP_{it}) + \theta_i(DP_{it})^{2}$, the estimated slope of $V_{it}$ with respect to $DP_{it}$ is
$SL_{it} = \beta_i + 2\theta_i(DP_{it})$. Here the slope is measured at the mean value of $DP_{it}$ for each good $i$ (see table 1).

* Significant with less than 90 percent confidence.


III. Conclusions 

The empirical work presented above, based on very detailed store-level
data for 13 goods over a period of 12 years, confirms the theoretical
hypotheses derived earlier. The rate of inflation was found to
influence positively and significantly the price dispersion of 12 of the
13 goods considered here. From the results presented in table 3, a 20
percent increase in the monthly beef liver inflation rate from 5 to 6
percent can be expected to increase the price dispersion of beef liver
by 6.25 percent, from 0.0653 to 0.0694.
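The beef liver figures can be checked by plugging the table 3 point estimates into the quadratic specification. The function name below is hypothetical; the coefficients are the JGLS estimates for beef liver as printed in table 3.

```python
def predicted_dispersion(alpha: float, beta: float, theta: float, dp: float) -> float:
    """Fitted V = alpha + beta*DP + theta*DP**2 at monthly inflation rate dp."""
    return alpha + beta * dp + theta * dp ** 2


# Beef liver, JGLS estimates from table 3.
ALPHA, BETA, THETA = 0.0447, 0.4158, -0.0670

v_at_5 = predicted_dispersion(ALPHA, BETA, THETA, 0.05)
v_at_6 = predicted_dispersion(ALPHA, BETA, THETA, 0.06)
pct_change = 100.0 * (v_at_6 / v_at_5 - 1.0)

print(f"V at 5 percent inflation: {v_at_5:.4f}")
print(f"V at 6 percent inflation: {v_at_6:.4f}")
print(f"percentage increase in dispersion: {pct_change:.2f}")
```

The fitted values round to 0.0653 and 0.0694, an increase of about 6.25 percent.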

This evidence buttresses the abundant casual evidence that the “law
of one price” is not a phenomenon of the real world and undermines
the explanation that all observed price dispersion is a manifestation of
actual or perceived differences in quality, service agreements, or location.
The empirical evidence cited here shows that price dispersion
does vary systematically with inflation.

Comment


Is Antitrust Enforcement Effective?

Craig M. Newmark

North Carolina State University


In a paper in this Journal, Block, Nold, and Sidak (1981) tried to 
determine the impact of antitrust enforcement on horizontal price 
fixing. After examining data on white bread markups in 20 U.S. 
cities, they concluded that the enforcement activity of the Department 
of Justice against wholesale bakers had both remedial and deterrent 
effects. 

In this paper, I question the authors’ empirical results for three
reasons. First, they used Bureau of Labor Statistics (BLS) retail prices in
computing bread markups; however, the white bread priced by the
bureau changed quality over the sample period. When a proxy for
this quality change is included in the authors’ specification, one deterrent
effect disappears. Second, the bread markups are sensitive to the
pricing decisions made by grocery retailers. When a proxy for retail
pricing decisions is included in the model, the deterrent effects are
weaker. Finally, the significance of the remedial effect depends on a
single observation, and this observation was affected by the two problems
just stated.

I. Review of the Block et al. Results

Block et al. computed a markup on white bread (price minus estimated
unit cost, divided by estimated unit cost) for 20 cities, i, for
most years, t, in the period 1964-76. 1 They regressed the first differ-

I appreciate the helpful comments of Stan Liebowitz, Wally Thurman, and an anonymous
referee. Scott Crowell provided fine research assistance. I am solely responsible
for any errors.

1 Block et al. computed the bread markups by first subtracting the cost of the major
ingredients, $IC_{it}$ (flour, sugar, shortening, and dry milk), from the retail bread price,
$P_{it}$. These recipe-adjusted prices were then regressed on the prices of electricity and
natural gas and on a proxy for labor costs. The fitted values from this regression were
used as estimates of the noningredient costs, $NIC_{it}$. The authors then computed the
bread markup by the formula (1981, p. 456)

$$M_{it} = \frac{P_{it} - IC_{it} - NIC_{it}}{IC_{it} + NIC_{it}}.$$

© 1988. All rights reserved.
TABLE 1

Basic Results of Block et al. and Newmark’s Replications

Independent Variable        Block et al. (1)    Newmark (2)

ΔBUDGET                     -.015               -.013
                            (-2.68)             (-2.94)
DOJREG                      -.026               -.032
                            (-2.21)             (-3.10)
DOJREM                      -.046               -.046
                            (-2.32)             (-2.39)
Constant                    .013                .017
R^2                         .082                .083
F-statistic                 6.04                7.34
Number of observations      208                 246

Note.—t-statistics are in parentheses.


ences of these markups, AAf„, on the following measures of enforce¬ 
ment effort: ABUDGET 0 the change in the real budget of the Anti¬ 
trust Division between year t and year t - 1; DOJREG#, a binary 
variable equaling one for each city within a region in which the Anti¬ 
trust Division filed a price-fixing case against wholesale bakers in year 
t —except for the city incurring the action—and equaling zero other¬ 
wise; 2 and DOJREM#, a binary variable equaling one in year t for a 
city in which a bread industry case was brought in year t - 1 and 
equaling zero otherwise. 
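The markup definition in note 1 and the first-differencing step can be sketched as follows. This is only an illustration under stated assumptions: the function names and the three-year price and cost figures are hypothetical and are not taken from Block et al. or from the replication data.

```python
from typing import Dict, Tuple


def markup(price: float, ingredient_cost: float, noningredient_cost: float) -> float:
    """Markup as in note 1: (P - IC - NIC) / (IC + NIC)."""
    unit_cost = ingredient_cost + noningredient_cost
    if unit_cost <= 0:
        raise ValueError("estimated unit cost must be positive")
    return (price - unit_cost) / unit_cost


def first_differences(series: Dict[int, float]) -> Dict[int, float]:
    """Map each year t to M_t - M_(t-1), skipping years with no predecessor."""
    return {t: series[t] - series[t - 1] for t in sorted(series) if t - 1 in series}


# Hypothetical (P, IC, NIC) in cents per pound for a single city.
# These figures are illustrative only and do not come from the paper.
city_data: Dict[int, Tuple[float, float, float]] = {
    1970: (24.0, 8.0, 12.0),
    1971: (25.0, 8.5, 12.5),
    1972: (24.5, 9.0, 13.0),
}

markups = {year: markup(*values) for year, values in city_data.items()}
delta_m = first_differences(markups)

for year, change in delta_m.items():
    print(year, round(change, 4))
```

In a panel of 20 cities the same differencing would be applied city by city, so that each ΔM_it compares a city only with its own previous year.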

The Block et al. estimates (p. 439, table 2) are reprinted in table 1. I
was able to replicate, even strengthen, their results. The coefficients
of ΔBUDGET, DOJREG, and DOJREM are all negative and significant, as
predicted by Block et al. (The quantitative differences in the
estimates are due to minor differences in the data used.) 3 I label the
impacts on bread markups of ΔBUDGET and DOJREG the “general
deterrent” and “regional deterrent” effects and I label the impact of
DOJREM the “remedial” effect.


II. Two Problems with the Block et al. Estimates 

A. A Measurement Problem: The Impact of 
Private-Label Brands on Retail Bread Price

Bread sold under the principal trade names of wholesale bakers was 
designated “advertised-label” or "wholesale baker” brand bread. 

2 The regions were defined by the jurisdictions of the Antitrust Division's regional
offices (see Block et al. 1978, pp. 60-61).

3 An unpublished appendix, available on request from the author, details the
sources and computational procedures used.
These brands competed against “private-label” bread. Private-label
brands were manufactured by some wholesale bakers and some retail
grocery store chains. For private-label brands, the brand name was
owned by the grocery retailer, and the retailer promoted and
advertised the bread. 4

Different recipes were used in making the two types of bread. Two
large bread company executives testified that private-label white
bread tended to be leaner, containing less milk, shortening, and
sugar. 5 Private-label brands were advertised less extensively (see
Storey and Farris 1964, p. 22; U.S. Federal Trade Commission 1967,
pp. 30-31; U.S. Council on Wage and Price Stability 1977, p. 21).
Recent theoretical work suggests that less advertised brands are
qualitatively different from heavily advertised brands (Klein and Leffler
1981). It is not surprising, therefore, that private-label brands sold at
lower retail prices than advertised-label brands. Walsh and Evans
(1963, pp. 126-27) estimated that private-label brands were
discounted an average of 17 percent relative to advertised-label brands.
The Federal Trade Commission (FTC) (1967, p. 17) reported an
average discount of 20 percent. The discount for private-label brands
existed throughout the entire sample period.

The BLS pricing procedure for white bread did not distinguish, 
however, between advertised-label and private-label brands. The 
brand having the largest volume in a sampled store was priced (Geith- 
man and Marion 1978, pp. 702-3). As a result, BLS retail bread 
prices should be negatively related to the market share of private- 
label brands. 
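The sampling rule described here, together with the two-to-one switching rule reported in note 9, can be sketched for a single store as follows. This is a simplified illustration under assumptions: the store, its two brands, the prices, and the path of private-label shares are all hypothetical, and `next_sampled_brand` is an illustrative name rather than a BLS procedure.

```python
from typing import Dict, List


def next_sampled_brand(current: str, volumes: Dict[str, float],
                       switch_ratio: float = 2.0) -> str:
    """Keep pricing `current` unless a rival outsells it by `switch_ratio` to one."""
    rivals = [brand for brand in volumes if brand != current]
    if not rivals:
        return current
    top_rival = max(rivals, key=lambda brand: volumes[brand])
    if volumes[top_rival] >= switch_ratio * volumes[current]:
        return top_rival
    return current


# Hypothetical store: prices in cents per pound, private-label share in percent.
PRICES = {"advertised": 35.0, "private": 29.0}
private_share_path = [30.0, 50.0, 60.0, 70.0, 65.0]

sampled = "advertised"
measured_prices: List[float] = []
for share in private_share_path:
    volumes = {"advertised": 100.0 - share, "private": share}
    sampled = next_sampled_brand(sampled, volumes)
    measured_prices.append(PRICES[sampled])

print(measured_prices)
```

Under these assumptions the measured price stays at the advertised-label price until the private label clearly dominates and then drops in one step, which is why the effect of private-label share on the measured price need not be linear. A city-level threshold would differ from the single-store threshold because a city's price is built from many stores.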

Empirical support for this conclusion follows. Survey data on the
market share of private-label brands are available for 18 large cities in
February 1974 (U.S. Senate 1975, appendix). These shares range
from a low of 22.1 percent in Buffalo to a high of 61.5 percent in
Seattle. Regressing the BLS retail bread price for February 1974
(measured in cents per pound) on the private-label market share
(measured as a percentage), I obtain (t-statistics are in parentheses)

$$\text{bread price} = \underset{(17.2)}{40.1} - \underset{(-3.42)}{.205}\ \text{private-label share},$$

$$R^2 = .42, \quad N = 18, \quad \text{mean of dependent variable} = 32.5.$$
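As a worked check on the size of this effect, the reported coefficients can be evaluated at the lowest and highest private-label shares in the sample. The sketch below uses only figures stated in the text (the coefficients, the Buffalo and Seattle shares, and the mean of the dependent variable); the variable names are illustrative, and the reported coefficients are rounded, so the results are approximate.

```python
INTERCEPT = 40.1   # cents per pound, from the reported equation
SLOPE = -0.205     # cents per pound per percentage point of private-label share


def predicted_price(share_pct: float) -> float:
    """Fitted BLS retail bread price at a given private-label share (percent)."""
    return INTERCEPT + SLOPE * share_pct


low_share_price = predicted_price(22.1)    # Buffalo's share
high_share_price = predicted_price(61.5)   # Seattle's share
gap = low_share_price - high_share_price

# An OLS line with an intercept passes through the sample means, so the
# share at which the fitted price equals the mean price (32.5) should be
# close to the sample mean of private-label share.
implied_mean_share = (32.5 - INTERCEPT) / SLOPE

print(round(gap, 2), round(implied_mean_share, 1))
```

The implied gap of about 8 cents per pound between the two extremes is large relative to the mean price of 32.5 cents, and the implied mean share of roughly 37 percent is close to the 37.6 percent average reported in note 8, as one would expect given rounding.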


4 In the Matter of International Telephone and Telegraph Corp., et al., 104 FTC 280
(1984), pp. 293, 302 (ITT owned Continental Baking Co., the largest wholesale
baking firm in the United States).

5 See the testimony of R. N. Laughlin (p. 6154) and R. A. Jackson (p. 6253) in
[source illegible].
Higher private-label market share significantly lowers measured retail
price. 6

The market share of private-label brands grew in the 1960s and
1970s for a reason unrelated to the activities of the Antitrust Division.
Private labels grew because supermarket chains grew. From 1958 to
1977 the chains' share of retail grocery sales rose from 44.0 percent to
59.6 percent (Marion 1986, pp. 462-63). Large chains were more
likely than small grocery firms to sell private-label bread (National
Commission on Food Marketing 1966a, p. 131; 1966b, pp. 28-29)
because bread was manufactured under economies of scale and also
because the chains were able to distribute bread at lower cost. The
U.S. Department of Agriculture (1968, p. 12) observed that “in short,
the wholesale bakery delivery system, developed years ago to meet the
needs of the neighborhood retail grocery, has become more costly
relative to sales volumes, and local wholesale bakeries are finding it
difficult to compete with the high-volume cost-cutting wholesale
delivery system developed by chainstore bakeries.” Chains also
promoted private-label sales by featuring them as loss leaders (see, e.g.,
Walsh and Evans 1963, pp. 102-3). 7

An expert witness in an FTC proceeding testified that the share of
private-label brands of U.S. white bread sales, in three selected years,
was 18 percent in 1960, approximately 36 percent in 1971, and
approximately 50 percent in 1977 (In the Matter of International Telephone
and Telegraph Corp., et al. [104 FTC 280, 294]). These percentages are
roughly consistent with information in three other sources. 8

6 If the market share of the top four brands purchased (as a percentage) is added to
the model, the following result is obtained:

$$\text{bread price} = \underset{(9.71)}{68.6} - \underset{(-6.37)}{.343}\ \text{private-label share} - \underset{(-4.14)}{.388}\ \text{share of top four brands}, \quad R^2 = .73.$$

Two cost-of-living proxies, per capita disposable income and truck drivers' wages, were
not significant. See n. 9 for further discussion of the effect of private-label share on the
BLS retail bread price.

7 One can test the hypothesis that the chains' growth, rather than antitrust activity,
promoted private-label sales. I regressed the market share of private-label brands in 18
cities in February 1974 on two variables. (Private-label market shares are reported in
U.S. Senate 1975, appendix.) One variable was a dummy variable for the filing of a
bread price-fixing case in that city during 1957-74. The other variable was a 1972
Herfindahl index for city grocery store concentration. The latter variable proxied the
degree to which chains controlled grocery sales in a city. I found that while the
enforcement dummy was positively related to private-label share, it had a t-statistic of
only 0.90. The chain sales proxy was also positively related to private-label share, but in
contrast with the enforcement dummy, it had a t-statistic of 2.12.

8 Walsh and Evans (1963, p. 100) cited an estimate that 25 percent of total bread sales
in 1958 were private-label. An (unweighted) average of private-label share in 18 cities
for February 1974 equaled 37.6 percent. Connor and Wills (1988, pp. 121, 126) report
that the share of private-label brands during 1980-82 was 53 percent.
Since increases in private-label share coincided with increases in the
budget of the Antitrust Division, the general deterrent effect
observed by Block et al. could be simply an artifact of the retail price
data. In particular, the fall in measured bread markups in the
mid-1970s (see table 2) probably resulted from an increase in private-label
sales. 9


B. A Specification Problem: Grocery Retailers’ Pricing 
Strategies Could Affect Bread Markups 

Some changes in measured bread markups could have been caused,
not by changes in bakers' prices, but by changes in the pricing
decisions of grocery retailers. Periods of especially intense competition
among supermarket chains occurred in the 1970s. For instance, retail
grocers' margins plummeted in most areas of the country in 1972 as a
result of A & P's drastic price cutting (U.S. Federal Trade
Commission 1975). The years 1975 and 1976 also witnessed price wars in
some areas (Forbes, August 1, 1975, pp. 36-37; Business Week, April 5,
1976, pp. 76-77). Grocers, not bakers, might have been responsible
for the lower measured bread markups in these three years. 10

Falling retail markups in 1972 are potentially important for the
coefficient of DOJREG because nine of the 25 observations on the
regional effect (observations for which DOJREG = 1) are from 1972.

My reason for including a control for retail pricing strategy is
analogous to the reasoning of Block et al. They tested variables that
controlled for “general year-to-year variations in manufacturing
markups that might not be adequately controlled by the first-difference
procedure” (1981, p. 440). Since bread was sold by grocery retailers, it
is logical also to check if changes in retailing markups affected their
results.


9 The negative effect of private-label market share on the BLS retail price should be
nonlinear. Once the bureau selects a particular brand to be sampled in a given outlet, it
continues to sample the same brand unless another brand outsells the sampled brand
by roughly two to one (U.S. Department of Labor 1966, pp. 68-69). Increases in
private-label share will only lower the BLS retail price in a given city when private-label
share becomes large enough that the bureau begins to sample these brands rather than
advertised-label brands. An inspection of BLS retail prices and private-label market
shares for 18 cities in February 1974 suggests that the “critical” private-label share was
in the range of 42-45 percent. These data are shown in the unpublished appendix to
this paper. Also in the appendix is an analysis of how the increase in private-label share
from 36 percent to 50 percent during 1971-77 can explain most of the change in
SPREAD during those years.

10 Even if competition also lowered the retail margins on flour, sugar, and oil (the
retail prices of which were subtracted from the numerator of the markup), the bread
markup was still likely to be affected. The reason is that the cost of these three
ingredients was equal to only about one-third of the retail bread price.
TABLE 2

Values of Mean ΔM, ΔBUDGET, ΔSPREAD, and ΔGROCPROF

Year      Mean ΔM        ΔBUDGET        ΔSPREAD     ΔGROCPROF

1965       .0072          .3803            .1          -.26
1966       .0293         -.1019            .2          -.17
1967    [illegible]       .1133            .1          -.36
1968       .0117       [illegible]        -.1           .16
1969       .0268          .1018            .2          -.17
1970    [illegible]      1.0142           1.0          -.13
1971      -.0151          .5128           -.2          -.17
1972      -.0683          .6573           -.8          -.62
1973       .0127       [illegible]         .8           .16
1974       .0499          .3696            .4           .15
1975      -.0083       [illegible]       -1.2          -.08
1976    [illegible]      1.7202          -1.2          -.05
1977      -.0089         2.2216           -.3          -.28

Source: See appendix, available from author.

Note: BUDGET is measured in millions of 1967 dollars; SPREAD is measured in cents
per pound; GROCPROF is equal to profit before income tax divided by sales for a
sample of supermarket chains.


III. An Extended Model 

A. Additional Variables 

To control for the retail decision and private-label share effects on
bread markups, I add two variables, ΔGROCPROF_t and ΔSPREAD_t,
to the Block et al. specification. Values for these variables are
presented in table 2. The variable GROCPROF equals the ratio of profit
before income tax to sales for a sample of grocery store chains;
ΔGROCPROF is the first-differenced series.

I lack complete time-series data on private-label share, so
ΔSPREAD is used as a proxy. The variable SPREAD equals the
national retail bread price (BLS measure) minus the national wholesale
bread price (Department of Agriculture measure). The Agriculture
Department's measure of wholesale price reflected the price of
advertised-label brands only (Schnake 1981, p. 12; U.S. Department of
Agriculture 1977, p. 25). After one controls for changes in retailers'
margins with ΔGROCPROF, ΔSPREAD should proxy changes in
private-label share. 11 Specifically, increases in private-label share will


11 If grocery retailers changed their pricing or marketing of bread given a constant
wholesale price, SPREAD would change. But ΔGROCPROF should control for changes
in retailers' decisions if two assumptions hold: changes in retailers' bread margins were
positively correlated with changes in their overall margins, and changes in their overall
margins were reflected in their profit to sales ratios. Neither assumption is necessarily
true. But if these assumptions are incorrect and ΔGROCPROF does not control for
changes in retailers' decisions, my results are not biased against the findings of Block et
al. At worst, ΔSPREAD might then reflect some changes in retailers' bread margins
TABLE 3

Estimates of the Extended Model

Independent Variable          (1)         (2)         (3)         (4)

ΔBUDGET                      -.013      -.0003       -.007       .0014
                            (-2.94)     (-.06)     (-1.63)       (.30)
DOJREG                       -.032       -.022       -.018       -.012
                            (-3.10)    (-2.10)     (-1.85)     (-1.26)
DOJREM                       -.046       -.041       -.052       -.048
                            (-2.39)    (-2.20)     (-2.95)     (-2.74)
ΔSPREAD                      . . .        .024       . . .        .017
                                        (4.54)                  (3.34)
ΔGROCPROF                    . . .       . . .        .093        .082
                                                    (6.51)      (5.67)
Constant                      .017        .009        .024        .018
R²                            .083        .156        .220        .255
F-statistic                   7.34       11.09       17.04       16.44
Number of observations         246         246         246         246

Note: t-statistics are in parentheses.


cause SPREAD to decline (U.S. Department of Agriculture 1977, p. 
25). Lower SPREAD values might also be caused by unmeasured 
discounts from the list wholesale price. The evidence available on this 
possibility suggests that this problem is not significant, though. 12 

B. Empirical Results 

Estimates of the extended model are shown in table 3. The table
shows the Block et al. specification, the results from adding either
ΔGROCPROF or ΔSPREAD separately, and the results from
including both ΔGROCPROF and ΔSPREAD.


as well as changes in private-label market share. Note, though, that both types of changes
must be controlled for if changes in retail prices are to represent changes in wholesale
prices, as Block et al. intend. (And there is evidence that changes in SPREAD during
1971-77 can be attributed to changes in private-label market share; see n. 9.)

12 There are three reasons to believe that unmeasured discounts from the list
wholesale price did not affect SPREAD significantly: (1) In estimating the wholesale
price, the Department of Agriculture (1976, p. 46) tried to include discounts from the
list price. (2) The U.S. Council on Wage and Price Stability (1977, pp. 15, 40) estimated
wholesale transaction prices for 1972-75 by surveying bakers. The council's prices
were close to Agriculture's figures, close enough that when I reestimate the model
replacing SPREAD with a modified measure that used the council's data, I find no
significant change in the results (see the appendix). (3) If the sharp drop in SPREAD
[illegible] was caused by bakers who were discounting advertised-label
brands, their profits should have decreased. But bakers' profits were much higher in
[year illegible] (U.S. Council on Wage and Price Stability 1977, pp. 48-52). Their
[illegible] margins were equal to their 1975 margins (U.S. Federal Trade Commission
These estimates support two conclusions. First, either individually
or together, ΔGROCPROF and ΔSPREAD greatly improve the fit of
the model, enter the model with the expected positive signs, and are
significant at the .01 level. Second, if included one at a time,
ΔGROCPROF and ΔSPREAD weaken the significance of either one or two of
the antitrust variables, and if they are entered together, they greatly
weaken the significance of both the general and regional deterrent
effects: ΔBUDGET has virtually no effect; DOJREG's coefficient is
reduced 60 percent and is not significant at the .10 level (two-tailed
test). When ΔGROCPROF and ΔSPREAD are added to other
specifications examined by Block et al., I obtain similar results. 13
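The summary statistics in table 3 can be cross-checked against one another. The sketch below assumes that the reported F-statistic is the conventional test that all slope coefficients are zero in a model with an intercept; that assumption is not stated in the text, and the function name is illustrative. The inputs are the R², sample sizes, and F-statistics as read from table 3, and the DOJREG coefficients are those in columns 1 and 4 (the latter as repeated in column 1 of table 4).

```python
from typing import List, Tuple


def overall_f(r_squared: float, n_obs: int, n_regressors: int) -> float:
    """F-statistic for H0: all slopes are zero, in an OLS model with an intercept."""
    df_resid = n_obs - n_regressors - 1
    return (r_squared / n_regressors) / ((1.0 - r_squared) / df_resid)


# (R-squared, observations, number of slope coefficients, F reported in table 3)
table3: List[Tuple[float, int, int, float]] = [
    (0.083, 246, 3, 7.34),
    (0.156, 246, 4, 11.09),
    (0.220, 246, 4, 17.04),
    (0.255, 246, 5, 16.44),
]

for r2, n, k, reported in table3:
    print(k, round(overall_f(r2, n, k), 2), reported)

# Proportional fall in the DOJREG coefficient from column 1 to column 4.
dojreg_reduction = 1.0 - (-0.012) / (-0.032)
print(round(dojreg_reduction, 3))
```

Because R² is reported to only three decimals, the recomputed values should match the reported F-statistics only approximately. The DOJREG calculation corresponds to the reduction of roughly 60 percent cited in the text.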

C. The Own-City Remedial Effect 

The remedial effect of antitrust enforcement, measured by the
coefficient of DOJREM, is little changed in the extended model.
However, this effect is based on a small number of observations: DOJREM
is equal to one for only seven of the 246 observations in the sample.

To further investigate the remedial effect, I disaggregated
DOJREM: I reestimated the extended model replacing DOJREM with
seven dummy variables. Each dummy variable equaled one for a city
and year in which DOJREM equaled one. One of the seven dummy
variable coefficients was positive, and three others were statistically
very close to zero. Only one of the seven coefficients, the one for
Baltimore in 1972, was negative and significant at the .05 level. The
large negative t-statistic for Baltimore, -5.28, suggested that the
significance of DOJREM depended on this single observation.

This suggestion is confirmed by results reported in table 4: if the
Baltimore 1972 observation is dropped from the data, the t-value for
DOJREM in the extended model is -0.89 (vs. -2.74 with the
observation included). If the ΔM value for this observation is halved in
magnitude, from -.2704 to -.1352 (even then it is the second most
negative ΔM value in the entire sample), the t-value for DOJREM is
-1.66, shy of significance at the .05 level (two-tailed test).
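The sensitivity of a dummy-variable coefficient to one extreme observation can be illustrated with a deliberately simplified sketch. It regresses a dependent variable on an intercept and a single 0/1 dummy only, so it omits the other regressors of the extended model. All data below are hypothetical except the value -.2704 for Baltimore 1972, and the resulting t-statistics are not expected to reproduce those in table 4.

```python
from math import sqrt
from statistics import mean
from typing import List, Tuple


def dummy_regression(treated: List[float], control: List[float]) -> Tuple[float, float]:
    """OLS of y on an intercept and a 0/1 dummy; returns (coefficient, t-statistic)."""
    mean_t, mean_c = mean(treated), mean(control)
    ssr = (sum((y - mean_t) ** 2 for y in treated)
           + sum((y - mean_c) ** 2 for y in control))
    n = len(treated) + len(control)
    sigma2 = ssr / (n - 2)  # two estimated parameters: intercept and dummy
    se = sqrt(sigma2 * (1 / len(treated) + 1 / len(control)))
    coef = mean_t - mean_c  # dummy coefficient equals the difference in means
    return coef, coef / se


# Hypothetical markup changes. Only -0.2704 (Baltimore 1972) comes from the text.
cycle = [-0.04, -0.02, 0.0, 0.02, 0.04]
control = [cycle[i % 5] for i in range(239)]
others = [0.01, -0.005, 0.0, -0.02, 0.005, -0.03]

_, t_full = dummy_regression(others + [-0.2704], control)
_, t_half = dummy_regression(others + [-0.1352], control)
_, t_drop = dummy_regression(others, control)
print(round(t_full, 2), round(t_half, 2), round(t_drop, 2))
```

With only seven observations for which the dummy equals one, a single large negative value dominates the group mean, so halving or dropping it moves the t-statistic sharply toward zero. The same mechanism is at work in columns 2 and 3 of table 4.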

Why was the Justice Department’s prosecution of price fixing so 
much more effective in Baltimore than prosecutions in the six other 
cities? Circumstantial evidence suggests that the Baltimore effect was 



13 Block et al. observed (1981, p. 444) that “government-imposed price-fixing
penalties were trivial.” They tested the hypothesis that the deterrent effects of antitrust
activity became significant only after large private class actions developed. They also
examined how mid-1970s legal developments, such as the Eisen IV decision and the
Hart-Scott-Rodino Antitrust Improvements Act, affected their results. My estimates
of these specifications, as well as several others that Block et al. examined, are reported
in the unpublished appendix to this paper.
TABLE 4

The Extended Model with Changes Made to Baltimore 1972 Observation

Independent Variable              (1)         (2)         (3)

ΔBUDGET                          .0014       .0012       .0010
                                 (.30)       (.25)       (.22)
DOJREG                           -.012       -.014       -.014
                                (-1.26)     (-1.43)     (-1.51)
DOJREM                           -.048       -.028       -.016
                                (-2.74)     (-1.66)      (-.89)
ΔSPREAD                           .017        .017        .017
                                 (3.34)      (3.39)      (3.37)
ΔGROCPROF                         .082        .074        .070
                                 (5.67)      (5.35)      (4.98)
Constant                          .018        .017        .017
R²                                .255        .240        .224
F-statistic                      16.44       15.19       13.79
Number of observations             246         246         245
F-statistics for all three
  antitrust coefficients = 0      3.10        1.64        1.08

Note: t-statistics are in parentheses. Column 1 repeats the results listed in col. 4 of
table 3. In col. 2, ΔM for Baltimore 1972 has been set to half its actual value. In col. 3,
the observation for Baltimore 1972 is omitted.


an artifact of an increase in private-label market share. The share 
increase was probably caused by a change in the pricing strategy of 
Baltimore’s grocery retailers. The same data problems that cast doubt 
on the regional and general deterrent effects also raise a question 
about the remedial effect in Baltimore. 14 

The FTC staff studied the Baltimore bread market and presented 
its findings in the 1967 Economic Report on the Baking Industry. The staff 
observed (pp. 92-93) that the Baltimore bread market was unusual: 

The pattern of consumer preference [in Baltimore] was such 


14 The city showing the second-largest remedial effect (a poor second to
Baltimore) is San Diego in 1976. There are two interesting facts about the apparent
remedial effect there. The defendants received a directed innocent verdict on the criminal
price-fixing charges (Commerce Clearing House 1967-77, case 2465). This outcome
was highly unusual: the government wins the overwhelming majority of the
price-fixing cases it brings. If there was not satisfactory evidence of price fixing, we should
question whether the government's action could have had any remedial effect. This is
especially true in the light of a second fact. A severe retail grocery price war broke out
in Los Angeles in 1976 (Business Week, April 5, 1976, pp. 26-27; Wall Street Journal, July
1[illegible], 1976, p. 1). The war apparently extended to San Diego. It is reflected in the
consumer price index for food at home: the average increase in this index for 1975-76
for 20 non-Southern California cities was 2.25 percent; Los Angeles showed only a 0.1
percent increase, and San Diego's index fell 0.1 percent. These two percentages are the
smallest increases in the entire sample and are significantly different from those of the
other cities (t-ratios of -2.34). So, as in Baltimore, the effect of retail grocers' pricing
casts doubt on the remedial effect.
that private label sales by food chains in Baltimore were
restricted relative to other cities.

Because of the importance of wholesale bakers brands, 
which are generally higher-priced, the BLS average price of 
bread for Baltimore was high compared with other cities. 
Even though the average prices of both wholesale baker 
brands and private labels were not significantly different 
from, and often lower than, their counterparts in other 
cities, the great weight (volume of sales) of the higher-priced 
brands caused the overall price to be higher. 

The report illustrated this analysis by comparing Baltimore's retail
price to the retail price in Washington, D.C. In August 1966, the
Baltimore BLS price was 3.7 cents per pound higher than the
Washington BLS price. Yet the report noted that “the average prices of
private label bread in the two cities were almost the same while the
average of wholesale baker brand bread in Baltimore was only about
0.4 cent a pound higher” (p. 93).

According to the Justice Department, in August 1966 the
price-fixing conspiracy was in full swing (Block et al. 1978, p. 84). The
finding that the 1966 wholesale-label price in Baltimore was not
significantly different from wholesale-label prices in Washington,
D.C., and other cities strongly suggests that the Baltimore conspiracy
failed to raise prices.

If so, what explains the remedial effect apparently observed in
Baltimore following the Justice Department's complaint in June
1971? From 1971 to 1972, while the U.S. average bread price dipped
0.3 cents per pound, the Baltimore price dropped 5.2 cents per
pound. The information in the FTC report raises the possibility that a
decline, even of this magnitude, could have been just an artifact of the
BLS's procedure for measuring bread prices. If the wholesale-label
and private-label brand weights used in Washington had been applied
in Baltimore in August 1966, the Baltimore BLS price would have
been at least 3.3 cents per pound lower. This 3.3-cent difference can
explain most of the 5.2-cent decline seen during 1971-72; thus an
increase in private-label market share sufficient to change Baltimore's
weights to Washington's could account for the observed “remedial”
effect.
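The 3.3-cent figure can be reproduced with a short derivation. It rests on an assumption not spelled out in the text, namely that the measured city price behaves like a weighted average of the two brand types, and it treats the private-label prices in the two cities as equal ("almost the same" in the FTC report). The symbols below are introduced only for this illustration.

```latex
% Assumption: the measured city price is a weighted average of the two brand types.
% A_c, L_c : average wholesale-baker and private-label prices in city c (cents per pound)
% w_c      : weight on wholesale-baker brands in city c; B = Baltimore, W = Washington
\begin{align*}
P_c(w) &= w\,A_c + (1 - w)\,L_c, \\
P_B(w_B) - P_W(w_W) &= 3.7, \\
P_B(w_W) - P_W(w_W) &= w_W\,(A_B - A_W) + (1 - w_W)\,(L_B - L_W)
  \approx 0.4\,w_W \le 0.4, \\
P_B(w_B) - P_B(w_W) &\ge 3.7 - 0.4 = 3.3, \\
\frac{3.3}{5.2} &\approx 0.63 .
\end{align*}
```

Under these assumptions a shift of Baltimore's weights to Washington's would by itself account for roughly 63 percent of the 5.2-cent decline, with no change in either brand's price.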

The FTC staff predicted that the market share of private-label
brands in Baltimore would increase (p. 94). But the timing of the big
decline in Baltimore's price, during 1971-72, seems to favor the
antitrust effect theory. Another explanation of the timing of the price
decline is available, however, an explanation that relies on the
unusual structure of Baltimore's grocery retail market combined with a
change in grocery retailing that occurred in 1972.

Baltimore was one of the few major cities in the country in which
the A & P supermarket chain held a leading market share of the retail
grocery market. In 1972 A & P converted its stores to the Warehouse
Economy Outlet (WEO) format in which a “significantly narrower
selection of food products” was available at very large discounts
(Fruhan 1979, p. 216). The WEO conversion took place at different
speeds in different regions of the country, and the size of the
discounts also apparently varied by region. The mid-Atlantic Coast
region was the region in which the conversion occurred most rapidly
and in which discounts were evidently most severe. 15 The A & P
outlets in Baltimore, as part of the WEO program, may well have
stocked less of the advertised-label bread or given it less attractive
shelf space, resulting in a larger share for A & P's own private label.
(On the importance of good shelf space, see U.S. Federal Trade
Commission [1967, p. 50].) And any action taken by A & P probably would
have affected the entire market: a leading Baltimore baker identified
A & P as the price leader in the market (U.S. Federal Trade
Commission 1967, p. 91, n. 50).

Further, while the Baltimore bakers were charged in June 1971, by
December 1971 the Baltimore BLS price had not declined. It was
even slightly higher, 30.3 cents per pound compared to 30.1 cents per
pound in June. The Baltimore price fell a little in January 1972, but
the big decrease, 6.5 cents per pound, came only in February.
February 1972 was also the month in which A & P advertised the conversion
of its Baltimore stores to WEO outlets (Baltimore Sun, February 20,
1972). 16 The fall in Baltimore's price was much closer in time to the
WEO conversion than to the antitrust complaint. 17

15 Business Week (May 20, 1972, p. 72) discusses how WEO conversions proceeded at
varying speeds in different regions of the country. Data reported in Business Week
(September 30, 1972, pp. 56-57) and in an FTC report (1975, pp. 30-32) suggest that
the impact of WEO conversion was more severe in the East, especially in the
mid-Atlantic Coast area. I tested whether 1972 ΔM values were a function of A & P's retail
market share. A & P's market share had the expected negative sign, but it was very
insignificant statistically. The test does not control, however, for the speed of adoption
and severity considerations discussed above.

16 The advertisement stated that the conversion would be effective March 13. But a
change in the stocking or placement of advertised-label brands could well have
occurred before March 13. A & P began preparing its stores for the WEO campaign at the
end of 1971 (U.S. Federal Trade Commission 1975, p. 17).

17 Block et al. contended that “for strategic reasons,” price fixers would wait 1 year
after the filing of a case before lowering their markups. The authors did not specify
these reasons. Whatever reasons they had in mind, such a delay could be costly. If the
price fixers' customers later sue for damages and win, the delay will have lengthened
the period over which the customers were harmed. This would increase the damages
that the price fixers would be forced to pay.
IV. Conclusion

This paper showed that the results of the novel attempt by Block et al.
to measure the effects of antitrust enforcement are questionable. I
showed that changes in grocery retailers' margins and in a proxy for
private-label market share explain most of the regional and general
deterrent effects. The significance of the remedial effect depends on
a single observation, an observation I argued was affected by a
measurement problem.

The implication that antitrust enforcement had doubtful remedial
effect is surprising in view of the large penalties price-fixing firms
paid in class action suits. One explanation for the apparent absence of
a strong remedial effect is that attempts to fix prices failed. This result
could have occurred because executives overestimated their ability to
make conspiracies work (Posner 1976, p. 41) or because the
price-fixing conspiracies detected were not intended to raise prices (Dewey
1979, 1982).

Another explanation consistent with the small remedial effect
observed is that conspiracies were very effective against a few customers
while ineffective against most customers. Both theory and empirical
evidence indicate that conspiracies against governmental or regulated
buyers are unusually effective, particularly if sales are made through
sealed bids (Newmark [1988] gives references). Of the 10 conspiracies
in the Block et al. sample, five allegedly involved bid rigging against
governmental buyers; in two of the remaining conspiracies, bid
rigging was not specifically alleged, but sales to public institutions were
cited (Commerce Clearing House 1967-77). Given a decision to fix
prices against governmental buyers, an attempt to fix prices against
other customers should have had lower marginal costs than
otherwise. But it is plausible that price fixing against supermarket chains
failed, given the chains' relatively high concentration at the local level
and their ability to produce bread at their own plants. 18
Book Review


Deregulation and the Future of Intercity Passenger Travel. By John R. Meyer and
Clinton V. Oster, Jr., with John S. Strong, José A. Gómez-Ibáñez, Don
H. Pickrell, Marni Clippinger, and Ivor P. Morgan.

Cambridge, Mass.: M.I.T. Press, 1987. Pp. xviii + 294. $27.50.


The deregulation of U.S. domestic transportation industries, which occurred
during the 1978-82 period, has now generated almost a decade of
experience with unfettered competition for passengers and freight. This experience
has been a rich source of economic research, of which the present book is the
most recent example.

Among studies of deregulation in passenger transportation, the present 
one is unique in that it applies a fully intermodal perspective. In other words, 
it not only is concerned with the effects of deregulation on airlines (the topic 
of most other such studies) but also examines what has happened in other 
modes of intercity passenger transportation, including buses (for which a 
regulatory loosening occurred in 1982), Amtrak, and the private automobile 
(which remains by far the most popular mode of intercity passenger travel). 

Airlines remain the most important part of the story to tell here, and the 
first eight substantive chapters of the book deal with that mode. Each of these 
chapters views the effects of air deregulation from a different perspective, 
including each of the following topics: airline financial performance, the 
behavior of new entrant firms, the behavior of established firms, productivity 
and labor markets, effects on fares and service, effects on travel agencies, and 
effects on international markets for air transportation. 

The second part of the book deals with the effects of deregulation on
competition for other modes of transportation, that is, rail, bus, and auto.
Central to this part is an analytical comparison of costs and demand
for each of the modes, with a suggestion of the future directions that market
equilibrium will take in intercity passenger travel.

Finally, there are two important appendices. The first provides a detailed
intermodal cost comparison of the air, rail, bus, and auto modes. The second
provides some econometric models and estimates for intercity passenger
demand (the models developed and estimated are of two types: one is based on
aggregate time-series analysis and another is based on disaggregated
cross-sectional analysis).

The conclusions of such a broad study are many and difficult to summarize 
in a short review such as this. The most important conclusion is that the study 
is supportive of nearly all the regulatory changes that have occurred over the 
last decade. It documents that while a few travelers might pay higher fares as 
a result of air deregulation, most travelers seem to have benefited in fares, 
and, furthermore, airline “service has either improved or held constant in
most markets since deregulation” (p. 124). The study also concludes that
those international routes that have seen relaxation of regulatory restrictions 
have similarly enjoyed lower fares with no worse service. 

In the intermodal area, the authors note that the lower plane fares
(especially for price-sensitive traffic) resulting from air deregulation have caused a shift
away from the private auto and toward air. They note, furthermore, that if 
subsidies to Amtrak were eliminated, other modes (perhaps particularly 
buses) would stand to benefit. 

The authors are particularly pessimistic about intercity rail passenger
transportation. They note that intercity rail transport fares poorly in a cost
comparison with bus or, on longer hauls, air transportation, and they note that
the combination of economy and convenience provided by the private auto 
will continue to assure its dominant share in the intercity travel market for the 
foreseeable future. 

An important general conclusion of the book is that in every sector of 
intercity passenger transportation, deregulation has improved the
responsiveness of the market to travelers’ needs, a point documented especially for
air and bus. 

Overall, I find this a very good and informative book. The part on airlines
provides new evidence and perspectives on airline deregulation, but its
findings are also reasonably consistent with those of over half a dozen
previous studies (which are too numerous to cite in the short space of this review).
The portion of the book on rail, bus, auto, and intermodal comparisons 
represents an area of research that has not been pursued (to my knowledge) 
since transportation deregulation occurred, and a new analysis is especially 
welcome (preceding studies are Meyer et al. [1959] and Keeler [1971], both 
done a long time ago). 

Although the overall quality of the book is quite good, it is possible to find
details of scholarship with which to disagree. An example of this is the
intermodal cost comparison carried out in appendix A and on which some of the
conclusions of the main body of the book are based. Specifically, as is well 
known, a large part of the cost of a trip by public transportation varies with 
space provided rather than passengers using it. Thus the estimates of costs 
per seat-mile or per passenger-mile are likely to be quite sensitive to the 
assumed seating density (i.e., a vehicle that has more seats packed in tightly
will have substantially lower costs than a vehicle with roomier seats spaced 
further apart). 

It is thus worrisome to find that the authors seem to have allowed 20-40 
percent more seating space for rail travelers than for bus or air travelers. 
They assume a standard 47-passenger bus for which (as is documented in 
Keeler [1971]) required seat size is roughly 18 inches wide and 32-34 inches
pitch. It is documented similarly that the aircraft configurations they assume 
(such as a DC9-30 with 115 passengers) are consistent with similar seating 
densities. But the railcars that they assume seat 64-76 passengers and, as
Keeler (1971) documented in detail, standard railcars of the dimensions and
capacities indicated in this book would seat at least 110 passengers per car if 
they had the same seating densities as a bus or air coach. Thus with the same 
seating configurations, rail travel would be considerably cheaper per seat-mile 
or passenger-mile than the figures in this book would imply. Indeed, for the 
two-car, self-propelled train set they consider, they assume a capacity of only 
128 passengers, whereas in a bus or airlike configuration, the train set would
seat 220. Since very few of the costs vary with passengers carried, but rather
only with capacity provided, it is clear that the costs of the train set would be
dramatically lower if seating configuration were standardized. 
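
The arithmetic behind this point can be illustrated with a minimal sketch. It
assumes, as a simplification, that the cost of running the train set is
entirely fixed with respect to seating (the text says only that very few costs
vary with passengers carried). The function name and the normalized cost of 1.0
are hypothetical; only the capacities of 128 and 220 seats come from the text.

```python
def cost_per_seat_mile(cost_per_vehicle_mile: float, seats: int) -> float:
    """Spread a capacity-related cost evenly over the seats provided."""
    if seats <= 0:
        raise ValueError("seats must be positive")
    return cost_per_vehicle_mile / seats


# Hypothetical normalized cost of moving the two-car train set one mile.
# Assumption: this cost does not change when seats are added or removed.
TRAIN_SET_COST = 1.0

as_assumed = cost_per_seat_mile(TRAIN_SET_COST, 128)    # capacity used in the book
standardized = cost_per_seat_mile(TRAIN_SET_COST, 220)  # bus or air coach density

# Proportional fall in cost per seat-mile from standardizing seating density.
reduction = 1 - standardized / as_assumed
print(f"Cost per seat-mile falls by {reduction:.0%}")
```

Under that assumption, the cost per seat-mile of the train set falls by roughly
42 percent when it carries 220 seats rather than 128. Any cost that does vary
with passengers carried would make the true reduction somewhat smaller.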

It may be that the typical existing railcar has a lower density, presumably 
because rail passengers like more space (though it is certainly feasible to 
configure short-haul passenger trains in this high-density configuration; the 
high-speed trains in Japan do so). But then, I believe, this difference in 
service quality should at least be pointed out to the reader. 

For the most part, this scholarly detail is unlikely to make much difference 
in the conclusions of this book. On most routes, especially long-haul and
low-density ones, intercity rail passenger service is simply not economic (and has
not been for some decades; Meyer et al. first pointed this fact out in 1959). 
And even on short-haul, high-density routes, especially given current wages 
and work rules, rail passenger service in this country does not appear to be 
particularly attractive economically, from either a private or social viewpoint. 
But the comparison would still be clearer (and certainly more favorable to
rail) if it were based on consistent assumptions regarding passenger space. 

Another disagreement relates to the air travel demand model based on 
time-series data and reported in appendix B1. I would argue that it should
include a measure of service quality in addition to variables representing fares 
and economic conditions. 

While these criticisms are not totally trivial, they also do not in any way 
detract from the basic points in the book, and they are perhaps suggestive of 
the detailed level of analysis at which my disagreements occur. 

Overall, this is a very good and highly informative study, recommended to 
anyone desiring to keep up on the economics of intercity passenger
transportation in the late 1980s.

Theodore E. Keeler 

University of California, Berkeley 